Re: [PATCH 1/3] [lockup detector] sync touch_*_watchdog back to old semantics

From: Don Zickus
Date: Wed Sep 01 2010 - 11:51:35 EST


Top posting because droid won't let me bottom post.

This patch was the result of a regression with acpi and preempt. Akpm asked that I not change the semantics of the old touch_nmi_watchdog. So I tried to revert to the old behaviour.

Sorry for not properly explaining that.

Cheers,
Don

Ingo Molnar <mingo@xxxxxxx> wrote:

>
>* Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:
>
>> On 9/1/10, Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:
>> > On 9/1/10, Ingo Molnar <mingo@xxxxxxx> wrote:
>> >>
>> >> * Don Zickus <dzickus@xxxxxxxxxx> wrote:
>> >>
>> >>> void touch_nmi_watchdog(void)
>> >>> {
>> >>> -	__get_cpu_var(watchdog_nmi_touch) = true;
>> >>> +	if (watchdog_enabled) {
>> >>> +		unsigned cpu;
>> >>> +
>> >>> +		for_each_present_cpu(cpu) {
>> >>> +			if (per_cpu(watchdog_nmi_touch, cpu) != true)
>> >>> +				per_cpu(watchdog_nmi_touch, cpu) = true;
>> >>> +		}
>> >>
>> >> Hm, this is going to be a scalability nightmare with lots of CPUs. Not
>> >> only do we have a nr_cpus loop, but we touch per-cpu areas of _other_
>> >> CPUs - a big scalability nono.
>> >>
>> >> Why do we need to do this? We never needed to touch other CPU's NMI
>> >> lockup accounting data areas - why has this changed? The changelog does
>> >> not explain this.
>> >>
>> >> Thanks,
>> >>
>> >> Ingo
>> >>
>> > I believe this came from old nmi watchdog code where it might be
>> > useful when nmi watchdog activated via io-apic. I'm trying to figure
>> > out if we really need it still.
>>
>> Well, we can't drop it or make per-cpu specific, for example we need
>> it in case of panic with watchdog enabled and panic timeout set, or
>> boot delay set and etc. Seems same applies to printk_delay. Hmm...
>
>Ok - can you cite the old watchdog code, did it really do a nr_cpus
>loop?
>
>Thanks,
>
> Ingo