Re: lockup on Athlon systems, kernel race condition?

From: Manfred Spraul (manfred@colorfullife.com)
Date: Tue Sep 03 2002 - 16:46:11 EST


> Terence Ripperda wrote:
>>
>> ...
>>
>> asmlinkage long sys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
>> {
>> struct file * filp;
>> unsigned int flag;
>> int on, error = -EBADF;
>>
>> filp = fget(fd);
>> if (!filp)
>> goto out;
>> error = 0;
>> lock_kernel(); <====
Which compiler do you use, and which kernel? Which additional patches?

With my 2.4.20-pre4-ac1 kernel, the lock_kernel call is at offset +3a;
according to your dump it's at +6a.

>> switch (cmd) {
>
> This CPU is spinning, waiting for kernel_flag. It will take the IPI
> and the other CPU's smp_call_function() will succeed.
>
> Possibly the IPI has got lost - seems that this is a popular failure mode
> for flakey chipsets/motherboards.
>
> Or someone has called sys_ioctl() with interrupts disabled. That's very
> doubtful.

Is it possible to display the cpu registers with kdb? Could you check
that the interrupts are enabled?

I'd add a quick test into sys_ioctl() or lock_kernel(): save_flags, and
check that bit 9 (the interrupt flag) is always set. Check __global_cli
for sample code. The X server probably runs with enough privileges to
disable the interrupts; perhaps it's doing something stupid.
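
A rough sketch of the kind of check I mean, written as plain C so the bit
test can be tried outside the kernel. The helper names
(flags_irqs_enabled, check_irqs_enabled) and the mask macro are made up
for illustration; in the kernel the flags word would come from
save_flags(flags) and the message would go through printk instead of
fprintf.

```c
#include <stdio.h>

/* Bit 9 of the x86 EFLAGS register is IF, the interrupt-enable flag. */
#define EFLAGS_IF_MASK (1UL << 9)

/* Hypothetical helper: nonzero if a saved flags word has IF set. */
static int flags_irqs_enabled(unsigned long flags)
{
	return (flags & EFLAGS_IF_MASK) != 0;
}

/*
 * Hypothetical helper: report a caller that entered with interrupts off.
 * Assumption: in the kernel, `flags` is filled in by save_flags(flags)
 * at the top of sys_ioctl() or lock_kernel(), and fprintf is printk.
 * Returns 1 if interrupts were enabled, 0 otherwise.
 */
static int check_irqs_enabled(unsigned long flags, const char *where)
{
	if (!flags_irqs_enabled(flags)) {
		fprintf(stderr, "%s: entered with interrupts disabled (flags=%#lx)\n",
			where, flags);
		return 0;
	}
	return 1;
}
```

If the message ever shows up for sys_ioctl, the caller disabled
interrupts before the syscall, which would explain why the spinning CPU
never takes the IPI.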

--
	Manfred



