Re: [BUG] msr-trace.h:42 suspicious rcu_dereference_check() usage!

From: Jiri Olsa
Date: Thu Feb 23 2017 - 07:24:44 EST


On Mon, Nov 21, 2016 at 10:28:50AM +0100, Peter Zijlstra wrote:
> On Mon, Nov 21, 2016 at 01:53:43AM +0100, Jiri Olsa wrote:
> > hi,
> > Jan hit the following output when MSR tracepoints are enabled on an AMD server:
> >
> > [ 91.585653] ===============================
> > [ 91.589840] [ INFO: suspicious RCU usage. ]
> > [ 91.594025] 4.9.0-rc1+ #1 Not tainted
> > [ 91.597691] -------------------------------
> > [ 91.601877] ./arch/x86/include/asm/msr-trace.h:42 suspicious rcu_dereference_check() usage!
> > [ 91.610222]
> > [ 91.610222] other info that might help us debug this:
> > [ 91.610222]
> > [ 91.618224]
> > [ 91.618224] RCU used illegally from idle CPU!
> > [ 91.618224] rcu_scheduler_active = 1, debug_locks = 0
> > [ 91.629081] RCU used illegally from extended quiescent state!
> > [ 91.634820] no locks held by swapper/1/0.
> > [ 91.638832]
> > [ 91.638832] stack backtrace:
> > [ 91.643192] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.9.0-rc1+ #1
> > [ 91.649457] Hardware name: empty empty/S3992, BIOS 'V2.03 ' 05/09/2008
> > [ 91.656159] ffffc900018fbdf8 ffffffff813ed43c ffff88017ede8000 0000000000000001
> > [ 91.663637] ffffc900018fbe28 ffffffff810fdcd7 ffff880233f95dd0 00000000c0010055
> > [ 91.671107] 0000000000000000 0000000000000000 ffffc900018fbe58 ffffffff814297ac
> > [ 91.678560] Call Trace:
> > [ 91.681022] [<ffffffff813ed43c>] dump_stack+0x85/0xc9
> > [ 91.686164] [<ffffffff810fdcd7>] lockdep_rcu_suspicious+0xe7/0x120
> > [ 91.692429] [<ffffffff814297ac>] do_trace_read_msr+0x14c/0x1b0
> > [ 91.698349] [<ffffffff8106ddb2>] native_read_msr+0x32/0x40
> > [ 91.703921] [<ffffffff8103b2be>] amd_e400_idle+0x7e/0x110
> > [ 91.709407] [<ffffffff8103b78f>] arch_cpu_idle+0xf/0x20
> > [ 91.714720] [<ffffffff8181cd33>] default_idle_call+0x23/0x40
> > [ 91.720467] [<ffffffff810f306a>] cpu_startup_entry+0x1da/0x2b0
> > [ 91.726387] [<ffffffff81058b1f>] start_secondary+0x17f/0x1f0
> >
> >
> > it went away with the attached change.. but this RCU logic
> > is far beyond me, so it's just a wild guess.. ;-)
>
> I think I prefer something like the below, that only annotates the one
> RDMSR in question, instead of all of them.
>
>
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index 0888a879120f..d6c6aa80675f 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -357,7 +357,7 @@ static void amd_e400_idle(void)
>  	if (!amd_e400_c1e_detected) {
>  		u32 lo, hi;
>
> -		rdmsr(MSR_K8_INT_PENDING_MSG, lo, hi);
> +		RCU_NONIDLE(rdmsr(MSR_K8_INT_PENDING_MSG, lo, hi));
>
>  		if (lo & K8_INTP_C1E_ACTIVE_MASK) {
>  			amd_e400_c1e_detected = true;

hum, I might have missed some other solution in the discussion,
but I can't see this one being pulled in.. should I resend this?
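
For anyone reading along, here is a rough userspace sketch of why the
one-line annotation above silences the splat. Everything prefixed with
mock_/MOCK_ is a hypothetical stand-in, not kernel code: the nesting
counter only imitates "RCU is watching this CPU", and the macro only
mirrors the general shape of RCU_NONIDLE() (enter, run one statement,
exit). It is an illustration under those assumptions, not the actual
implementation.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; none of this is kernel code. */
static int rcu_nesting;       /* > 0 means "RCU is watching this CPU" */
static int suspicious_reads;  /* traced reads performed while RCU was not watching */

static void mock_rcu_enter(void) { rcu_nesting++; }
static void mock_rcu_exit(void)  { rcu_nesting--; }

/* Same general shape as RCU_NONIDLE(): run one statement with RCU watching. */
#define MOCK_RCU_NONIDLE(stmt)			\
	do {					\
		mock_rcu_enter();		\
		do { stmt; } while (0);		\
		mock_rcu_exit();		\
	} while (0)

/* Stand-in for a traced rdmsr(): the tracepoint is the part that needs RCU. */
static unsigned int mock_traced_rdmsr(void)
{
	if (rcu_nesting == 0)
		suspicious_reads++;	/* the real kernel would print the lockdep splat here */
	return 0;
}

/*
 * Simulate one pass through the idle routine, with or without the annotation.
 * Returns how many reads would have triggered the warning.
 */
static int mock_idle_pass(bool annotate)
{
	unsigned int lo;

	rcu_nesting = 0;	/* idle CPU: RCU is not watching */
	suspicious_reads = 0;

	if (annotate)
		MOCK_RCU_NONIDLE(lo = mock_traced_rdmsr());
	else
		lo = mock_traced_rdmsr();

	(void)lo;
	return suspicious_reads;
}
```

In this sketch mock_idle_pass(false) counts one suspicious read and
mock_idle_pass(true) counts none, which is the behaviour the annotated
rdmsr() is meant to give on the idle path.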

thanks,
jirka