Re: [patch V6 12/37] x86/entry: Provide idtentry_entry/exit_cond_rcu()

From: Thomas Gleixner
Date: Tue May 19 2020 - 16:20:30 EST


Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes:
> Andy Lutomirski <luto@xxxxxxxxxx> writes:
>> On Fri, May 15, 2020 at 5:10 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>>> The pagefault handler cannot use the regular idtentry_enter() because that
>>> invokes rcu_irq_enter() if the pagefault was caused in the kernel. Not a
>>> problem per se, but kernel side page faults can schedule which is not
>>> possible without invoking rcu_irq_exit().
>>>
>>> Adding rcu_irq_exit() and a matching rcu_irq_enter() into the actual
>>> pagefault handling code would be possible, but not pretty either.
>>>
>>> Provide idtentry_entry/exit_cond_rcu() which calls rcu_irq_enter() only
>>> when RCU is not watching. The conditional RCU enabling is a correctness
>>> issue: A kernel page fault which hits an RCU idle region can neither
>>> schedule nor is it likely to survive. But avoiding RCU warnings or RCU side
>>> effects is at least increasing the chance for useful debug output.
>>>
>>> The function is also useful for implementing lightweight reschedule IPI and
>>> KVM posted interrupt IPI entry handling later.
>>
>> Why is this conditional? That is, couldn't we do this for all
>> idtentry_enter() calls instead of just for page faults? Evil things
>> like NMI shouldn't go through this path at all.
>
> I thought about that, but then ended up with the conclusion that RCU
> might be unhappy, but my conclusion might be fundamentally wrong.

It's about this:

rcu_nmi_enter()
{
	if (!rcu_is_watching()) {
		make it watch;
	} else if (!in_nmi()) {
		do_magic_nohz_dyntick_muck();
	}
}

So if we make all irq/system vector entries conditional, then
do_magic() never gets executed. After that I got lost...
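
To make the concern concrete, here is a toy user-space model of the
branch structure above. It is only a sketch: every name prefixed with
model_ is hypothetical, the globals stand in for real kernel state, and
the counter merely records how often the "magic" branch would run. It
says nothing about what the real nohz/dyntick code does.

```c
#include <stdbool.h>

/* Toy state; stands in for rcu_is_watching() and in_nmi(). */
static bool model_rcu_watching;
static bool model_in_nmi;
/* Counts how often the do_magic_nohz_dyntick_muck() branch is taken. */
static int model_magic_calls;

/* Same branch structure as the rcu_nmi_enter() pseudocode above. */
static void model_rcu_nmi_enter(void)
{
	if (!model_rcu_watching)
		model_rcu_watching = true;	/* "make it watch" */
	else if (!model_in_nmi)
		model_magic_calls++;		/* the "magic" branch */
}

/* Unconditional entry: always calls into the RCU entry function. */
static void model_enter_uncond(void)
{
	model_rcu_nmi_enter();
}

/*
 * Conditional entry: only calls into RCU when it is not watching.
 * Returns true if the matching exit would have to undo the enter.
 */
static bool model_enter_cond(void)
{
	if (model_rcu_watching)
		return false;
	model_rcu_nmi_enter();
	return true;
}

/* Run one non-NMI entry and report how often the magic branch ran. */
static int model_count_magic(bool conditional, bool watching_at_entry)
{
	model_rcu_watching = watching_at_entry;
	model_in_nmi = false;
	model_magic_calls = 0;

	if (conditional)
		model_enter_cond();
	else
		model_enter_uncond();

	return model_magic_calls;
}
```

In this model, an unconditional entry with RCU already watching takes
the magic branch once, while the conditional variant never reaches it,
whether RCU was watching at entry or not.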

Thanks,

tglx