Re: [BUG] msr-trace.h:42 suspicious rcu_dereference_check() usage!

From: Thomas Gleixner
Date: Tue Nov 29 2016 - 09:02:17 EST


On Tue, 29 Nov 2016, Borislav Petkov wrote:
> On Mon, Nov 21, 2016 at 05:06:54PM +0100, Borislav Petkov wrote:
> > IOW, what's the worst thing that can happen if we did this below?
> >
> > We basically get rid of the detection and switch the timer to broadcast
> > mode immediately on the halting CPU.
> >
> > amd_e400_idle() is behind an "if (cpu_has_bug(c, X86_BUG_AMD_APIC_C1E))"
> > check so it will run on the affected CPUs only...
> >
> > Thoughts?
>
> Actually, here's a better version. The E400 detection works only after
> ACPI has been enabled so we piggyback the end of acpi_init().
>
> We don't need the MSR read now - we do
>
> if (static_cpu_has_bug(X86_BUG_AMD_APIC_C1E))
>
> on the idle path which is as fast as it gets.
>
> Any complaints about this before I go and test it everywhere?

The issue is that you obviously start with the assumption that the machine
has this bug. As a consequence the machine is brute forced into tick
broadcast mode, which cannot be reverted when you clear that misfeature
after ACPI init. So in case of !NOHZ and !HIGHRES the periodic tick is
forced into broadcast mode, which is not what you want.

As far as I understood the whole magic, this C1E misfeature only takes
effect _after_ ACPI has been initialized. So instead of setting the bug in
early boot and therefore forcing the broadcast nonsense, we should only set
it when ACPI has actually detected it.
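
To make the ordering problem concrete, here is a rough userspace model of
the two variants. It is only a sketch: struct machine, model_idle(),
boot_assume_bug() and boot_detect_first() are made-up names for
illustration, not kernel interfaces, and the one-way switch to broadcast is
modelled by a plain enum.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum tick_mode { TICK_LOCAL, TICK_BROADCAST };

struct machine {
	bool c1e_detected;   /* C1E found enabled once ACPI is up */
	bool bug_c1e;        /* stands in for X86_BUG_AMD_APIC_C1E */
	enum tick_mode tick; /* one-way: once broadcast, stays broadcast */
};

/* Models the idle path: a CPU flagged as affected forces broadcast. */
static void model_idle(struct machine *m)
{
	if (m->bug_c1e)
		m->tick = TICK_BROADCAST;
}

/* Variant A: assume the bug in early boot, clear it after ACPI init. */
static void boot_assume_bug(struct machine *m)
{
	m->bug_c1e = true;
	model_idle(m);               /* idle runs before ACPI is up */
	if (!m->c1e_detected)
		m->bug_c1e = false;  /* too late, tick is already broadcast */
	model_idle(m);
}

/* Variant B: set the bug only once detection has actually happened. */
static void boot_detect_first(struct machine *m)
{
	assert(!m->bug_c1e);         /* nothing is assumed in early boot */
	model_idle(m);               /* early idle leaves the tick alone */
	if (m->c1e_detected)
		m->bug_c1e = true;
	model_idle(m);
}
```

On a machine without C1E, variant A ends up with the bug cleared but the
tick still stuck in broadcast mode, while variant B leaves the tick local.
On an affected machine variant B still ends up in broadcast mode, just
later.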

Thanks,

tglx