Re: [BUG] lockup with the latest kernel

From: Tejun Heo
Date: Thu Aug 27 2009 - 22:53:37 EST


Tejun Heo wrote:
>>> Always happens where one CPU is sending an IPI and the other has the rq
>>> spinlock. Seems to be that the IPI expects the other CPU to not have
>>> interrupts disabled or something?
>
> I'm not too familiar with APICs but AFAIK sending an IPI isn't an
> interlocked operation (at least not at the software level) so I doubt
> it has much to do with the other cpu doing or not doing anything. It
> looks like the local APIC is stuck hardware-wise. The only thing the
> commit changes is that cpu1 would be using vector 0xf1 instead of 0xf0
> together with cpu0.
>
> (reading the doc...) Okay, here's something interesting. It's from
> section 9.8.4 of Intel doc 253668.pdf - Intel 64 and IA-32
> Architectures Software Developer's Manual Volume 3A: System
> Programming Guide, Part 1.
>
>   For the P6 family and Pentium processors, the IRR and ISR registers
>   can queue no more than two interrupts per priority level, and will
>   reject other interrupts that are received within the same priority
>   level.
>
> And from AMD's 24593 - AMD64 Architecture Programmer's Manual Volume
> 2: System Programming, section 16.6.3.
>
>   No more than two interrupts can be pending for the same interrupt
>   vector number. Subsequent interrupt requests to the same interrupt
>   vector number will be rejected. See Figure 16-23 on page 445.
>

Oh... there are differences that I missed.

All Intels: If more than one interrupt is generated with the same
            vector number, the local APIC can set the bit for the
            vector both in the IRR and the ISR. This means that for
            the Pentium 4 and Intel Xeon processors, the IRR and ISR
            can queue two interrupts for each interrupt vector: one
            in the IRR and one in the ISR. Any additional interrupts
            issued FOR THE SAME INTERRUPT VECTOR are COLLAPSED INTO
            THE SINGLE BIT in the IRR.

PPro: No more than two interrupts PER PRIORITY LEVEL, and will REJECT
      OTHER interrupts.

AMD64: Subsequent interrupt requests to THE SAME INTERRUPT VECTOR
       NUMBER will be REJECTED.
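
To make the difference concrete, here is a toy model of the three
wordings above. It is only a sketch of how the quoted text reads, not
a description of real APIC hardware. The assumptions: the target CPU
has interrupts disabled and never services anything, every policy can
hold two interrupts per "slot", and the slot is the vector for the
Pentium 4/Xeon and AMD64 wording but the priority class (vector >> 4)
for the P6 wording. The policy names "p4", "p6" and "amd64" and the
function names are made up for illustration, and what a sender does
after a rejection is not modelled at all.

```python
from collections import Counter

# "No more than two" appears in all three quoted passages.
QUEUE_DEPTH = 2

# Hypothetical policy labels, one per quoted wording.
POLICIES = ("p4", "p6", "amd64")


def priority_class(vector):
    """Interrupt priority class: the upper four bits of the vector."""
    return vector >> 4


def send_all(policy, vectors):
    """Deliver vectors in order to a target that never services them.

    Returns one outcome per vector: "accepted", "collapsed" or
    "rejected".
    """
    if policy not in POLICIES:
        raise ValueError("unknown policy: %r" % (policy,))

    held = Counter()   # interrupts currently held, per slot
    outcomes = []
    for vec in vectors:
        # P6 wording limits per priority level, the others per vector.
        slot = priority_class(vec) if policy == "p6" else vec
        if held[slot] < QUEUE_DEPTH:
            held[slot] += 1
            outcomes.append("accepted")
        elif policy == "p4":
            # Extra interrupts merge into the already pending one.
            outcomes.append("collapsed")
        else:
            outcomes.append("rejected")
    return outcomes


if __name__ == "__main__":
    for policy in POLICIES:
        print(policy, send_all(policy, [0xF0, 0xF0, 0xF0]))
        print(policy, send_all(policy, [0xF0, 0xF1, 0xF0]))
```

In this model a third 0xf0 is collapsed under the Pentium 4/Xeon
wording but rejected under the AMD64 wording, and 0xf0 and 0xf1 only
compete for the same slot under the P6 wording because both fall in
priority class 0xf.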

Eh... I don't have the earlier AMD docs and have to go now. Can somebody
please check? But it looks like, on AMD, we can deadlock simply by
sending RESCHEDULE_VECTOR more than twice while holding the rq lock?

Thanks.

--
tejun