Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

From: Linas Vepstas
Date: Fri Jan 02 2009 - 22:45:31 EST


2009/1/2 Duane Griffin <duaneg@xxxxxxxxx>:
> On Fri, Jan 02, 2009 at 06:21:14PM -0600, Chris Adams wrote:
>> Once upon a time, Linas Vepstas <linasvepstas@xxxxxxxxx> said:
>> > Below follows a summary of the reported crashes. I'm ignoring the
>> > zillions of "mine didn't crash" reports, or the "you're a paranoid
>> > conspiracy theorist, it's random chance" reports.
>>
>> I have reproduced this and got a stack trace (this is with Fedora 8 and
>> kernel kernel-2.6.26.6-49.fc8.x86_64):
>>
>> Basically (to my untrained eye), the leap second code is called from the
>> timer interrupt handler, which holds xtime_lock. The leap second code
>> does a printk to notify about the leap second. The printk code tries to
>> wake up klogd (I assume to prioritize kernel messages), and (under some
>> conditions), the scheduler attempts to get the current time, which tries
>> to get xtime_lock => deadlock.
>
> How about just moving the printk out of the lock? I.e. something like
> this:

[...]

Sure looks like the right fix to me, although there's more than
one printk under that lock, so the others would need the same
treatment. Who's going to write the formal patch?
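
For anyone following along, here is a rough user-space sketch of the
idea, not the actual kernel patch. It assumes a plain pthread mutex
standing in for xtime_lock (which is really a seqlock), and the names
time_lock, read_time, log_message and timer_tick are all made up for
illustration. The point is only the ordering: decide what to report
while holding the lock, then do the logging after dropping it, since
the logging path may itself need the lock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for xtime_lock; the kernel uses a seqlock. */
pthread_mutex_t time_lock = PTHREAD_MUTEX_INITIALIZER;

long wall_seconds = 1230767999;  /* 2008-12-31 23:59:59 UTC */
bool leap_pending = true;        /* a leap second is scheduled */
int log_count = 0;               /* messages emitted so far */

/* Stand-in for the scheduler path that reads the clock.
 * It takes time_lock, so it must never run with the lock held. */
long read_time(void)
{
    pthread_mutex_lock(&time_lock);
    long t = wall_seconds;
    pthread_mutex_unlock(&time_lock);
    return t;
}

/* Stand-in for printk: emitting a message ends up reading the clock.
 * Calling this while holding time_lock would typically hang, which
 * is the deadlock described in the quoted report. */
void log_message(const char *msg)
{
    printf("[%ld] %s\n", read_time(), msg);
    log_count++;
}

/* Stand-in for the timer tick. State changes happen under the lock;
 * the notification is only recorded there and emitted after unlock. */
void timer_tick(void)
{
    bool inserted = false;

    pthread_mutex_lock(&time_lock);
    if (leap_pending) {
        /* Simplified: repeat the current second instead of advancing. */
        leap_pending = false;
        inserted = true;
    } else {
        wall_seconds++;
    }
    pthread_mutex_unlock(&time_lock);

    /* Lock is dropped, so the logging path can safely take it. */
    if (inserted)
        log_message("Clock: inserting leap second 23:59:60 UTC");
}
```

Calling timer_tick() once emits the leap-second notice and leaves the
clock on the same second; the next call advances it normally. Moving
the log_message() call back inside the locked region reproduces the
hang in this toy version.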

--linas