Re: [PATCH v2] x86/idle: Remove MFENCEs for X86_BUG_CLFLUSH_MONITOR

From: Uros Bizjak
Date: Fri Apr 04 2025 - 10:52:12 EST


On Fri, Apr 4, 2025 at 4:16 PM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>
> On 04/04/2025 3:14 pm, Uros Bizjak wrote:
> >
> >
> > On 2. 04. 25 19:24, Andrew Cooper wrote:
> >> Commit 7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH
> >> workaround, add barriers") adds barriers, justified with:
> >>
> >> ... and add memory barriers around it since the documentation is
> >> explicit
> >> that CLFLUSH is only ordered with respect to MFENCE.
> >>
> >> This also triggered the same adjustment in commit
> >> f8e617f45829 ("sched/idle/x86: Optimize unnecessary mwait_idle() resched
> >> IPIs") during development, although it failed to get the
> >> static_cpu_has_bug()
> >> treatment.
> >>
> >> X86_BUG_CLFLUSH_MONITOR (a.k.a the AAI65 errata) is specific to Intel
> >> CPUs,
> >> and the SDM currently states:
> >>
> >> Executions of the CLFLUSH instruction are ordered with respect to
> >> each
> >> other and with respect to writes, locked read-modify-write
> >> instructions,
> >> and fence instructions[1].
> >>
> >> With footnote 1 reading:
> >>
> >> Earlier versions of this manual specified that executions of the
> >> CLFLUSH
> >> instruction were ordered only by the MFENCE instruction. All
> >> processors
> >> implementing the CLFLUSH instruction also order it relative to the
> >> other
> >> operations enumerated above.
> >>
> >> i.e. The SDM was incorrect at the time, and barriers should not have
> >> been
> >> inserted. Double checking the original AAI65 errata (not available from
> >> intel.com any more) shows no mention of barriers either.
> >>
> >> Note: If this were a general codepath, the MFENCEs would be needed,
> >> because
> >> AMD CPUs of the same vintage do sport otherwise-unordered
> >> CLFLUSHs.
> >>
> >> Furthermore, use a plain alternative, rather than
> >> static_cpu_has_bug() and/or
> >> no optimisation. The workaround is a single instruction.
> >>
> >> Use an explicit %rax pointer rather than a general memory operand,
> >> because
> >> MONITOR takes the pointer implicitly in the same way.
> >>
> >> Link:
> >> https://web.archive.org/web/20090219054841/http://download.intel.com/design/xeon/specupdt/32033601.pdf
> >> Fixes: 7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH
> >> workaround, add barriers")
> >> Fixes: f8e617f45829 ("sched/idle/x86: Optimize unnecessary
> >> mwait_idle() resched IPIs")
> >> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> >> ---
> >> CC: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> >> CC: Ingo Molnar <mingo@xxxxxxxxxx>
> >> CC: Borislav Petkov <bp@xxxxxxxxx>
> >> CC: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> >> CC: x86@xxxxxxxxxx
> >> CC: "H. Peter Anvin" <hpa@xxxxxxxxx>
> >> CC: linux-kernel@xxxxxxxxxxxxxxx
> >>
> >> v2:
> >> * Fix the same pattern in mwait_idle() too
> >> * Expand on why we're not using a general memory operand.
> >> ---
> >> arch/x86/include/asm/mwait.h | 11 +++++------
> >> arch/x86/kernel/process.c | 10 ++++------
> >> 2 files changed, 9 insertions(+), 12 deletions(-)
> >
> > There is another instance of the same sequence in
> > arch/x86/kernel/smpboot.c:
> >
> > /*
> > * The CLFLUSH is a workaround for erratum AAI65 for
> > * the Xeon 7400 series. It's not clear it is actually
> > * needed, but it should be harmless in either case.
> > * The WBINVD is insufficient due to the spurious-wakeup
> > * case where we return around the loop.
> > */
> > mb();
> > clflush(md);
> > mb();
> > __monitor(md, 0, 0);
> > mb();
> > __mwait(eax_hint, 0);
> >
> > Should this also be converted to the new sequence?
>
> Yes, it ought to.
>
> I'm OoO for a while though. If you fancy doing a patch, please go
> ahead. Or a maintainer could fold a 3rd hunk into this patch?

Heh, I learned the hard way not to touch the code I don't understand. ;)

Uros.