Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt

From: Wanpeng Li
Date: Wed Oct 12 2016 - 07:07:21 EST


2016-09-19 16:10 GMT+08:00 Peter Zijlstra <peterz@xxxxxxxxxxxxx>:
> On Thu, Sep 15, 2016 at 10:58:04AM +0200, Thomas Gleixner wrote:
>> On Thu, 15 Sep 2016, Wanpeng Li wrote:
>> > ---
>> > arch/x86/include/asm/apic.h | 2 +-
>> > 1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
>> > index 1243577..71c1fe2 100644
>> > --- a/arch/x86/include/asm/apic.h
>> > +++ b/arch/x86/include/asm/apic.h
>> > @@ -650,8 +650,8 @@ static inline void entering_ack_irq(void)
>> >
>> > static inline void ipi_entering_ack_irq(void)
>> > {
>> > - ack_APIC_irq();
>> > irq_enter();
>> > + ack_APIC_irq();
>> > }
>>
>> which makes ipi_entering_ack_irq() the same as entering_ack_irq() and
>> therefore pointless.
>
> entering_ack_irq() seems to use entering_irq() instead of irq_enter().
> Which is close but not the same. This thing seems to also do
> exit_idle().
>
> Now, there's only a handful of ipi_entering_ack_irq() users, and it
> doesn't seem to make sense to me to only call exit_idle() on IPIs, why
> don't we need to call exit_idle() on regular IRQs ?!
>
> All in all, that stuff is crufty and needs a cleanup I'd say.

[ 116.587762]
[ 116.587768] ===============================
[ 116.587770] [ INFO: suspicious RCU usage. ]
[ 116.587773] 4.8.0+ #24 Not tainted
[ 116.587775] -------------------------------
[ 116.587777] ./arch/x86/include/asm/msr-trace.h:47 suspicious rcu_dereference_check() usage!
[ 116.587779]
[ 116.587779] other info that might help us debug this:
[ 116.587779]
[ 116.587782]
[ 116.587782] RCU used illegally from idle CPU!
[ 116.587782] rcu_scheduler_active = 1, debug_locks = 0
[ 116.587785] RCU used illegally from extended quiescent state!
[ 116.587787] no locks held by swapper/1/0.
[ 116.587788]
[ 116.587788] stack backtrace:
[ 116.587792] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0+ #24
[ 116.587794] Hardware name: Dell Inc. OptiPlex 7020/0F5C5X, BIOS A03 01/08/2015
[ 116.587796] ffff90285de03f58 ffffffff9d44a0c9 ffff90285ca5d100 0000000000000001
[ 116.587803] ffff90285de03f88 ffffffff9d0ebd67 ffff902845165410 000000000000080b
[ 116.587809] 0000000000000000 0000000000000000 ffff90285de03fb8 ffffffff9d492b95
[ 116.587814] Call Trace:
[ 116.587817] <IRQ> [<ffffffff9d44a0c9>] dump_stack+0x99/0xd0
[ 116.587827] [<ffffffff9d0ebd67>] lockdep_rcu_suspicious+0xe7/0x120
[ 116.587832] [<ffffffff9d492b95>] do_trace_write_msr+0x135/0x140
[ 116.587836] [<ffffffff9d06f860>] native_write_msr+0x20/0x30
[ 116.587841] [<ffffffff9d065fad>] native_apic_msr_eoi_write+0x1d/0x30
[ 116.587845] [<ffffffff9d05bd1d>] smp_reschedule_interrupt+0x1d/0x30
[ 116.587849] [<ffffffff9d8daec6>] reschedule_interrupt+0x96/0xa0
[ 116.587851] <EOI> [<ffffffff9d732634>] ? cpuidle_enter_state+0xe4/0x360
[ 116.587858] [<ffffffff9d73261f>] ? cpuidle_enter_state+0xcf/0x360
[ 116.587861] [<ffffffff9d7328e7>] cpuidle_enter+0x17/0x20
[ 116.587865] [<ffffffff9d0e1a73>] call_cpuidle+0x23/0x50
[ 116.587868] [<ffffffff9d0e1d0c>] cpu_startup_entry+0x15c/0x280
[ 116.587872] [<ffffffff9d05ce64>] start_secondary+0x154/0x180

irq_enter(), which is called from scheduler_ipi(), runs too late to tell
the RCU subsystem to end the extended quiescent state: ack_APIC_irq()
has already done the traced MSR write by then. Any ideas?
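
To make the ordering concrete, here is a rough userspace model of what the
backtrace above shows. It is only a sketch: every mock_* function, the
rcu_watching flag and the rcu_splats counter are made up for illustration
and are not kernel code. The only thing it tries to capture is that the
EOI write reaches a tracepoint, and the tracepoint complains if RCU has
not yet been told that the CPU left idle.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model state; none of this exists in the kernel. */
static bool rcu_watching; /* false while the CPU sits in the idle EQS */
static int rcu_splats;    /* count of "suspicious RCU usage" reports */

/* CPU goes idle: RCU stops watching it (extended quiescent state). */
static void mock_enter_idle(void) { rcu_watching = false; }

/* Stand-ins for irq_enter()/irq_exit(): RCU starts/stops watching. */
static void mock_irq_enter(void) { rcu_watching = true; }
static void mock_irq_exit(void) { rcu_watching = false; }

/* Stand-in for the write_msr tracepoint (do_trace_write_msr in the trace). */
static void mock_trace_write_msr(void)
{
    if (!rcu_watching) {
        rcu_splats++;
        puts("suspicious RCU usage: tracepoint hit while RCU is not watching");
    }
}

/* Stand-in for ack_APIC_irq() when the EOI is a traced MSR write. */
static void mock_ack_APIC_irq(void) { mock_trace_write_msr(); }

/* Order seen in the backtrace: EOI first, irq_enter() only later. */
static int ipi_ack_then_enter(void)
{
    rcu_splats = 0;
    mock_enter_idle();
    mock_ack_APIC_irq(); /* tracepoint fires inside the idle EQS */
    mock_irq_enter();    /* too late for the EOI write above */
    mock_irq_exit();
    return rcu_splats;
}

/* Order proposed in the quoted patch: irq_enter() first, then EOI. */
static int ipi_enter_then_ack(void)
{
    rcu_splats = 0;
    mock_enter_idle();
    mock_irq_enter();    /* RCU is watching from here on */
    mock_ack_APIC_irq(); /* tracepoint is now safe */
    mock_irq_exit();
    return rcu_splats;
}
```

Calling ipi_ack_then_enter() from a small main() reports one splat in this
model, while ipi_enter_then_ack() reports none. It says nothing about
whether reordering is the right fix for the real code, only why the
current order trips the check.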

Regards,
Wanpeng Li