Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

From: peterz
Date: Tue Aug 11 2020 - 05:47:25 EST


On Tue, Aug 11, 2020 at 11:20:54AM +0200, peterz@xxxxxxxxxxxxx wrote:
> On Tue, Aug 11, 2020 at 10:38:50AM +0200, Jürgen Groß wrote:
> > In case you don't want to do it I can send the patch for the Xen
> > variants.
>
> I might've opened a whole new can of worms here. I'm not sure we
> can/want to fix the entire fallout this release :/
>
> Let me ponder this a little, because the more I look at things, the more
> problems I keep finding... bah bah bah.

That is, most of these irq-tracking problem are new because commit:

859d069ee1dd ("lockdep: Prepare for NMI IRQ state tracking")

changed irq-tracking to ignore the lockdep recursion count.

This then allows:

lock_acquire()
raw_local_irq_save();
current->lockdep_recursion++;
trace_lock_acquire()
... tracing ...
#PF under raw_local_irq_*()

__lock_acquire()
arch_spin_lock(&graph_lock)
pv-spinlock-wait()
local_irq_save() under raw_local_irq_*()

However afaict that just made a bad situation worse. There already were
issues, take for example:

trace_clock_global()
raw_local_irq_save();
arch_spin_lock()
pv-spinlock-wait
local_irq_save()

And that has no lockdep_recursion to 'save' the say.

The tracing recursion does however avoid some of the obvious fails
there, like trace_clock calling into paravirt which then calls back into
tracing. But still, that would've caused IRQ tracking problems even with
the old code.

And in that respect, this is all the exact same problem as that other
set of patches has ( 20200807192336.405068898@xxxxxxxxxxxxx ).

Now, on the flip side, it does find actual problems, the trace_lock_*()
things were using RCU in RCU-disabled code, and here I found that
trace_clock_global() thinkg (and I suspect there's more of that).

But at this point I'm not entirelty sure how best to proceed... tracing
uses arch_spinlock_t, which means all spinlock implementations should be
notrace, but then that drops into paravirt and all hell breaks loose
because Hyper-V then calls into the APIC code etc.. etc..

At that rate we'll have the entire kernel marked notrace, and I'm fairly
sure that's not a solution either.

So let me once again see if I can't find a better solution for this all.
Clearly it needs one :/