Re: [PATCH v6 1/1] MIPS: Fix idle VS timer enqueue
From: Marco Crivellari
Date: Wed Mar 26 2025 - 05:46:41 EST
I'm mostly thinking about future changes in this part of the code.
But if it is ok what we have now, and future changes are not a
problem, let's keep this version.
Would this be ok with you @Maciej?
If so, the region is now 40 bytes. This is what I did yesterday:
@@ -110,6 +110,7 @@ LEAF(__r4k_wait)
.set noreorder
/* Start of idle interrupt region. */
MFC0 t0, CP0_STATUS
+ nop
/* Enable interrupt. */
ori t0, 0x1f
xori t0, 0x1e
@@ -128,7 +129,11 @@ LEAF(__r4k_wait)
*/
wait
/* End of idle interrupt region. */
-1:
+__r4k_wait_exit:
+ /* Check idle interrupt region size. */
+ .if ((__r4k_wait_exit - __r4k_wait) != 40)
+ .error "Idle interrupt region size mismatch: expected 40 bytes."
+ .endif
jr ra
nop
.set pop
@@ -139,10 +144,10 @@ LEAF(__r4k_wait)
.set push
.set noat
MFC0 k0, CP0_EPC
- PTR_LA k1, 1b
- /* 36 byte idle interrupt region. */
+ PTR_LA k1, __r4k_wait_exit
+ /* 40 byte idle interrupt region. */
ori k0, 0x1f
- PTR_ADDIU k0, 5
+ PTR_ADDIU k0, 9
bne k0, k1, \handler
MTC0 k0, CP0_EPC
.set pop
Under QEMU is working, but I'm not so sure the latest part is correct.
Thanks!
On Wed, Mar 26, 2025 at 2:20 AM Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:
>
> On Tue, Mar 25, 2025 at 10:09 PM Marco Crivellari
> <marco.crivellari@xxxxxxxx> wrote:
> >
> > Hi Maciej,
> >
> > Thanks a lot for all the information.
> >
> > > Unlike `__r4k_wait' that code is not in a `.set noreorder' region and
> > > the assembler will therefore resolve the hazard by inserting a NOP where
> > > required by the architecture level requested (with `-march=...', etc.).
> > > Try compiling that function for a MIPS III configuration such as
> > > decstation_r4k_defconfig or just by hand with `-march=mips3' and see
> > > what machine code is produced.
> >
> > I tried with the configuration you suggested, and indeed I can see a NOP.
> >
> > > Whatever manual you quote it refers to MIPS Release 2, which is only
> > > dated 2003
> >
> > About the MIPS manual, anyhow, it is "MIPS32 M4 Processor Core" (year 2008).
> > Maybe I've also picked the wrong manual.
> >
> > I've also found the manual you mentioned online, thanks.
> >
> > > Best is to avoid using a `.set noreorder' region in the first place.
> > > But is it really needed here? Does the rollback area have to be of a
> > > hardcoded size rather than one calculated by the assembler based on
> > > actual machine code produced? It seems to me having it calculated would
> > > reduce complexity here and let us use the EI instruction where available
> > > as well.
> >
> > Well, considering the complexity and how the code looks fragile even with
> > a small change, yes, that's likely better to avoid noreorder.
> In my opinion keeping "noreorder" is the simplest, which means just
> add an "nop" after MFC0 in the current version.
>
> Huacai
>
> >
> > I think I'm going to need some guidance here.
> > Please, correct me where something is wrong.
> >
> > 1)
> > When you say "let us use the EI instruction where available" are you
> > referring to do
> > something like below?
> >
> > #if defined(CONFIG_CPU_HAS_DIEI)
> > ei
> > #else
> > MFC0 t0, CP0_STATUS
> > ori t0, 0x1f
> > xori t0, 0x1e
> > MTC0 t0, CP0_STATUS
> > #endif
> >
> > 2)
> > Removing "noreorder" would let the compiler add "nops" where they are needed.
> > But that still means the 3 ssnop and ehb are still needed, right?
> >
> > My subsequent dumb question is: there is the guarantee that the
> > compiler will not
> > reorder / change something we did?
> > This question also came after reading the manual you quoted (paragraph
> > "Coprocessor Hazards"):
> >
> > "For example, after an mtc0 to the Status register which changes an
> > interrupt mask bit,
> > there will be two further instructions before the interrupt is
> > actually enabled or disabled.
> > [...]
> > To cope with these situations usually requires the programmer to take explicit
> > action to prevent the assembler from scheduling inappropriate
> > instructions after a
> > dangerous mtc0. This is done by using the .set noreorder directive,
> > discussed below"
> >
> > 3)
> > Considering the size is determined by the compiler, the check about
> > the idle interrupt
> > size region should not be needed, correct?
> >
> > 4)
> > ori and PTR_ADDIU should be removed of course from the rollback handler macro.
> > Can I have some hints about the needed change?
> > Using QEMU is always working, so I'm not sure if what I will change is
> > also correct.
> >
> >
> > Many thanks in advance, also for your time!
> >
> >
> >
> >
> > On Fri, Mar 21, 2025 at 9:11 PM Maciej W. Rozycki <macro@xxxxxxxxxxx> wrote:
> > >
> > > On Fri, 21 Mar 2025, Marco Crivellari wrote:
> > >
> > > > > This instruction sequence still suffers from the coprocessor move delay
> > > > > hazard. How many times do I need to request to get it fixed (counting
> > > > > three so far)?
> > > >
> > > > Can I have more details about this?
> > > >
> > > > I can see it is the same code present also in local_irq_enable()
> > > > (arch_local_irq_enable()),
> > >
> > > Unlike `__r4k_wait' that code is not in a `.set noreorder' region and
> > > the assembler will therefore resolve the hazard by inserting a NOP where
> > > required by the architecture level requested (with `-march=...', etc.).
> > > Try compiling that function for a MIPS III configuration such as
> > > decstation_r4k_defconfig or just by hand with `-march=mips3' and see
> > > what machine code is produced.
> > >
> > > > and from the manual I've seen:
> > > >
> > > > "The Spacing column shown in Table 2.6 and Table 2.7 indicates the
> > > > number of unrelated instructions (such as NOPs or SSNOPs) that,
> > > > prior to the capabilities of Release 2, would need to be placed
> > > > between the producer and consumer of the hazard in order to ensure
> > > > that the effects of the first instruction are seen by the second instruction."
> > > >
> > > > The "Spacing column" value is 3, indeed.
> > > >
> > > > "With the hazard elimination instructions available in Release 2, the
> > > > preferred method to eliminate hazards is to place one of the
> > > > instructions listed in Table 2.8 between the producer and consumer of the
> > > > hazard. Execution hazards can be removed by using the EHB [...]"
> > >
> > > Whatever manual you quote it refers to MIPS Release 2, which is only
> > > dated 2003 (following Release 1 from 2001), but `__r4k_wait' has to
> > > continue handling older architecture revisions going back to MIPS III
> > > ISA from 1991. We also support MIPS I ISA from 1985 which has further
> > > instruction scheduling requirements, but `__r4k_wait' is for newer
> > > processors only, because older ones had no WAIT instruction, so we can
> > > ignore them (but note that the MIPS I load delay slot is regardless
> > > observed in current code in the form of a NOP inserted after LONG_L).
> > >
> > > > What am I missing?
> > >
> > > This is common MIPS knowledge really, encoded in the GNU toolchain and
> > > especially GAS since forever. While I can't cite a canonical reference,
> > > the hazard is listed e.g. in Table 13.1 "Instructions with scheduling
> > > implications" and Table 13.3 "R4xxx/R5000 Coprocessor 0 Hazards" from
> > > "IDT MIPS Microprocessor Family Software Reference Manual," Version 2.0,
> > > from October 1996. I do believe the document is available online.
> > >
> > > I'm fairly sure the hazard is also listed there in Dominic Sweetman's
> > > "See MIPS Run Linux," but I don't have my copy handy right now.
> > >
> > > Best is to avoid using a `.set noreorder' region in the first place.
> > > But is it really needed here? Does the rollback area have to be of a
> > > hardcoded size rather than one calculated by the assembler based on
> > > actual machine code produced? It seems to me having it calculated would
> > > reduce complexity here and let us use the EI instruction where available
> > > as well.
> > >
> > > HTH,
> > >
> > > Maciej
> >
> >
> >
> > --
> >
> > Marco Crivellari
> >
> > L3 Support Engineer, Technology & Product
> >
> >
> >
> >
> > marco.crivellari@xxxxxxxx
--
Marco Crivellari
L3 Support Engineer, Technology & Product
marco.crivellari@xxxxxxxx