RE: endless loop in native_flush_tlb_others in smp_64.c

From: Pallipadi, Venkatesh
Date: Tue Mar 11 2008 - 10:00:23 EST




>-----Original Message-----
>From: chunkeey@xxxxxx [mailto:chunkeey@xxxxxx]
>Sent: Tuesday, March 11, 2008 3:31 AM
>To: Jike Song
>Cc: Linux Kernel; Pallipadi, Venkatesh; Ingo Molnar; Thomas Gleixner; Brown, Len
>Subject: Re: endless loop in native_flush_tlb_others in smp_64.c
>
>On Tuesday 11 March 2008 10:55:40 Jike Song wrote:
>> On Tue, Mar 11, 2008 at 6:16 AM, Chr <chunkeey@xxxxxx> wrote:
>> > Hi,
>> > Ever since I moved to 2.6.25-rcY (Y somewhere between 2 and 5),
>> > I've seen several instant freezes that are really hard to catch on my
>> > AMD64 Athlon X2 4200+ system...
>>
>> Here I guess 2.6.24 is fine for you?
>Yes, 2.6.24(.3) is fine!
>
>> > Most of them happened in X.org, so at first I thought it had something to
>> > do with the NVIDIA module... BUT, one time it froze way before the
>> > module could get loaded...
>> >
>> > ---
>> > SYSRQ-P revealed that the CPUs were looping inside:
>> >
>> > smp_64.c native_flush_tlb_others:
>> > assembler code:
>> > < 1ee: f3 90 pause
>> > < 1f0: f6 45 00 03 testb $0x3,0x0(%rbp)
>> > < 1f4: 75 f8 jne 1ee <native_flush_tlb_others+0x5f>
>> >
>> > also known as: (in C)
>> >
>> > while (!cpus_empty(f->flush_cpumask))
>> >         cpu_relax();
>> >
>> > So... does anyone have a good idea what exactly is happening here? Or is
>> > there already another thread or even a patch available?
>>
>> Would you please attach your config file? Do you have CONFIG_CPU_IDLE set?
>(Attached). Yes, CONFIG_CPU_IDLE is enabled! I guess I should now disable it,
>try again and report back, right?! ;-)

I don't think it is related to CPU_IDLE. The SYSRQ-T output that Thomas asked
for should give more clues.
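
For anyone following along, here is a rough user-space model of the wait
quoted above. It is only a sketch and not the kernel code: struct
flush_state, ack_flush(), wait_for_acks(), the "responsive" mask and the
spin budget are all made-up names for illustration, and the real loop has no
budget at all. It shows why the initiator spins forever if even one target
CPU never clears its bit in flush_cpumask. The 0x3 in the testb above would
be consistent with a two-CPU mask, but that is a guess from the disassembly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's flush state (not the real struct). */
struct flush_state {
    uint64_t flush_cpumask;   /* bit n set: CPU n has not acknowledged yet */
};

/* Model of a target CPU finishing its flush: it clears its own bit. */
static void ack_flush(struct flush_state *f, unsigned int cpu)
{
    f->flush_cpumask &= ~(UINT64_C(1) << cpu);
}

/*
 * Model of the initiator's wait. "responsive" marks the CPUs that actually
 * handle the request in this simulation. Returns true once the mask has
 * drained, false if max_spins polls pass with bits still pending.
 * The kernel loop quoted above has no such limit and would keep spinning.
 */
static bool wait_for_acks(struct flush_state *f, uint64_t responsive,
                          unsigned long max_spins)
{
    unsigned long spins = 0;

    while (f->flush_cpumask != 0) {
        if (spins++ >= max_spins)
            return false;               /* budget exhausted: looks like a hang */

        /* Simulate every responsive CPU acknowledging during this poll. */
        for (unsigned int cpu = 0; cpu < 64; cpu++) {
            uint64_t bit = UINT64_C(1) << cpu;
            if ((responsive & bit) && (f->flush_cpumask & bit))
                ack_flush(f, cpu);
        }
    }
    return true;
}
```

With a mask of 0x3 and both CPUs responsive, the wait returns right away.
With only CPU 0 responsive, bit 1 stays set and the model gives up after the
budget, which is the situation where the real loop would never exit.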

Thanks,
Venki