Re: [v3.10-rc1] WARNING: at kernel/rcutree.c:502

From: Srivatsa S. Bhat
Date: Tue May 14 2013 - 03:50:00 EST


On 05/14/2013 01:08 PM, Bjørn Mork wrote:
> "Srivatsa S. Bhat" <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> writes:
>> On 05/13/2013 08:09 PM, Bjørn Mork wrote:
>>
>>> Hey, hey, hey. Turns out this wasn't that wrong after all. That merge
>>> includes a one-line diff in kernel/cpu/idle.c and it *is* actually this
>>> diff which triggers the problem for me. Reverting it, using the attached
>>> patch, makes the warning go away. Which means that it had nothing to do
>>> with your RCU changes.
>>>
>>> But I haven't the faintest idea how this is supposed to work, or even
>>> how to explain the patch properly, so I think I need some help from
>>> Thomas here. Unless this makes you understand the real issue?
>>>
>>> Thomas, why does powertop trigger the
>>>
>>> WARNING: at kernel/rcutree.c:502 rcu_eqs_exit_common.isra.48+0x3d/0x125()
>>>
>>> without the attached patch? And what is the proper resolution?
>>>
>>
>> The problem appears to be in the cpu idle poll implementation. You can trigger
>> this problem by passing idle=poll in the kernel cmd-line as well, right?
>
> That sounded so obvious that it made me think "Doh, why didn't I just
> test that before?" But unfortunately there must be some other factor
> involved. No warnings observed during normal use when running with
> idle=poll:
>

I didn't expect warnings with normal use.

> bjorn@nemi:~$ dmesg|grep polling
> [ 0.000000] process: using polling idle threads
>
>
> I expected a flood of warnings here, but there is none until I start
> powertop (to confirm that the original issue is still there). So it's
> more than just entering cpu_idle_poll().
>

Yeah, of course it is :-) The warning triggers only when you enable the tracepoint
in the idle code. And in your case, powertop does that. That's why it only
triggers when you run powertop. Alternatively, if you enable the tracepoint
yourself manually, I bet you'll see the warnings, even without using powertop.
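
A minimal sketch of enabling that tracepoint by hand, for anyone who wants to
reproduce this without powertop. The event name (power/cpu_idle) and the
debugfs mount point (/sys/kernel/debug) are assumptions here, not something
stated in this thread, and set_idle_tracepoint is just a hypothetical helper
name:

```shell
#!/bin/sh
# Sketch only: the event name (power/cpu_idle) and the debugfs mount point
# are assumptions, not details given in this thread.

# Turn the cpu_idle trace event on (1) or off (0).
# $1: value to write, 0 or 1
# $2: tracing directory (defaults to the usual debugfs location)
set_idle_tracepoint() {
    value="$1"
    tracing_dir="${2:-/sys/kernel/debug/tracing}"
    enable_file="$tracing_dir/events/power/cpu_idle/enable"

    # Refuse early if the control file is missing or not writable.
    if [ ! -w "$enable_file" ]; then
        echo "cannot write $enable_file (not root, or debugfs not mounted?)" >&2
        return 1
    fi
    echo "$value" > "$enable_file"
}
```

Run as root on a kernel booted with idle=poll, "set_idle_tracepoint 1" followed
by "dmesg | grep rcutree" should show whether the warning appears;
"set_idle_tracepoint 0" turns the event off again.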

>> I think I understand what is going on here. Can you please try the fix below?
>> (It is only compile-tested since it's very late here and I really need to get
>> some sleep!).
>
> Works perfectly. Thanks.

Thanks for your testing!

> I assume this is the correct fix even if the
> problem isn't completely understood?
>

Hmm? Why do you say the problem isn't completely understood? I thought I
explained the problem in my changelog. Did I miss something?

Regards,
Srivatsa S. Bhat
