Re: [PATCH 11/13] arm: npcm: drop selecting non-existing ARM_ERRATA_794072
From: Avi Fishman
Date: Tue Nov 02 2021 - 04:33:49 EST
On Tue, Nov 2, 2021 at 10:22 AM Avi Fishman <avifishman70@xxxxxxxxx> wrote:
>
> Hi,
>
> At Nuvoton we implied this WA in the past, not because we encountered
> a specific problem but since the errata says so and we saw this in
> other patches like:
> https://patchwork.kernel.org/project/linux-arm-kernel/patch/1396298072-13254-2-git-send-email-nitin.garg@xxxxxxxxxxxxx/
> But we didn't upstream the arch/arm/mm/proc-v7.S
>
> From the errata document.
> "794072 A short loop including a DMB instruction might cause a denial
> of service on
> another processor which executes a CP15 broadcast operation
> Status
> Affects:
> Fault Type:
> Fault Status: Cortex-A9 MPCore.
> Programmer Category B
> Present in: All r1, r2, r3 and r4 revisions Open
> Description
> A processor which continuously executes a short loop containing a DMB
> instruction might prevent a CP15
> operation broadcast by another processor making further progress,
> causing a denial of service.
> Configurations affected
> This erratum affects all Cortex-A9 MPCore processors with two or more
> processors.
> Conditions
> The erratum requires the following conditions:
> - Two or more processors are working in SMP mode (ACTLR.SMP=1)
> - One of the processors continuously executes a short loop containing
> at least one DMB instruction.
> - Another processor executes a CP15 maintenance operation that is
> broadcast. This requires that this
> processor has enabled the broadcasting of CP15 operations (ACTLR.FW=1)
> For the erratum to occur, the short loop containing the DMB
> instruction must meet all of the following additional
> conditions:
> - No more than 10 instructions other than the DMB are executed between each DMB
> - No non-conditional Load or Store, or conditional Load or Store which
> pass the condition code check, are
> executed between each DMB
> When all the conditions for the erratum are met, the short loop
> creates a continuous stream of DMB instructions.
> This might cause a denial of service, by preventing the processor
> executing the short loop from executing the
> received broadcast CP15 operation. As a result, the processor that
> originally executed the broadcast CP15
> operation is stalled until the execution of the loop is interrupted.
> Note that because the process issuing the CP15 broadcast operation
> cannot complete operation, it cannot enter
> any debug-mode, and cannot take any interrupt. If the processor
> executing the short loop also cannot be
> interrupted, for example if it has disabled its interrupts, or if no
> interrupts are routed to this processor, this
> erratum might cause a system livelock.
> Implications
> The erratum might create performance issues, or in the worst case it
> might cause a system livelock if the
> processor executing the DMB is in an infinite loop that cannot be interrupted.
> Workaround
> This erratum can be worked round by setting bit[4] of the undocumented
> Diagnostic Control Register to 1. This
> register is encoded as CP15 c15 0 c0 1.
> This bit can be written in Secure state only, with the following
> Read/Modify/Write code sequence:
> MRC p15,0,rt,c15,c0,1
> ORR rt,rt,#0x10
> MCR p15,0,rt,c15,c0,1
> When it is set, this bit causes the DMB instruction to be decoded and
> executed like a DSB.
> Using this software workaround is not expected to have any impact on
> the overall performance of the processor
> on a typical code base.
> Other workarounds are also available for this erratum, to either
> prevent or interrupt the continuous stream of
> DMB instructions that causes the deadlock. For example:
> - Inserting a non-conditional Load or Store instruction in the loop
> between each DMB
> - Inserting additional instructions in the loop, such as NOPs, to
> avoid the processor seeing back to back
> DMB instructions.
> - Making the processor executing the short loop take regular interrupts."
>
> Avi
>
> On Tue, Nov 2, 2021 at 9:31 AM Lukas Bulwahn <lukas.bulwahn@xxxxxxxxx> wrote:
> >
> > On Fri, Oct 29, 2021 at 8:36 AM Joel Stanley <joel@xxxxxxxxx> wrote:
> > >
> > > On Thu, 28 Oct 2021 at 14:57, Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > > >
> > > > On Thu, Oct 28, 2021 at 4:19 PM Lukas Bulwahn <lukas.bulwahn@xxxxxxxxx> wrote:
> > > > >
> > > > > There is no and never was a Kconfig for ARM_ERRATA_794072 in the kernel
> > > > > tree. So, there is no need to select ARM_ERRATA_794072 in
> > > > > ./arch/arm/mach-npcm/Kconfig.
> > > > >
> > > > > Simply drop selecting the non-existing ARM_ERRATA_794072.
> > > > >
> > > > > This issue was discovered with ./scripts/checkkconfigsymbols.py.
> > > > >
> > > > > Signed-off-by: Lukas Bulwahn <lukas.bulwahn@xxxxxxxxx>
> > > > > ---
> > > >
> > > > Could this be a typo? Maybe we need to enable a different errata workaround
> > > > here, or maybe that code is actually needed and has to get sent.
> > >
> > > Doing some searching, u-boot had a workaround for something called
> > > ARM_ERRATA_794072.
> > >
> > > https://github.com/u-boot/u-boot/commit/f71cbfe3ca5d2ad20159871700e8e248c8818ba8
> > >
> > > Lore has the review history for that patch:
> > >
> > > https://lore.kernel.org/all/6be32e0b5b454ed7b609317266a8e798@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> > >
> > > It looks like it's the same workaround as ARM_ERRATA_742230, which the
> > > kernel does implement.
> > >
> > > It would be good to hear from the Nuvoton people, or an Arm person.
> > >
> >
> > I will happily update the patch to select ARM_ERRATA_742230 instead of
> > the dead non-existing ARM_ERRATA_794072.
> >
> > In contrast to the current patch that basically only cleans up "dead
> > config" and has no effective functional change, the new patch would
> > change the behaviour. I cannot test this patch (beyond some basic
> > compile test) on the hardware; so, we certainly need someone to have
> > that hardware, knows how to test it or confirm otherwise that we
> > should select the ARM_ERRATA_742230 fix for this hardware.
Note that ARM_ERRATA_742230 is applied in code up-to CORTEX-A9 r2p2
but while ARM_ERRATA_794072 exist also in CORTEX-A9 r4p1
https://github.com/torvalds/linux/blob/322a3b843d7f475b857646ed8f95b40431d3ecd0/arch/arm/mm/proc-v7.S#L347
> >
> > The current patch should be subsumed by the new patch; the submission
> > of the new patch is deferred until that person shows up. Let's see.
> >
> > Lukas
>
>
>
> --
> Regards,
> Avi
--
Regards,
Avi