Re: [PATCH] rtc: fix deadlock

From: Maciej W. Rozycki
Date: Sat Aug 23 2008 - 14:47:20 EST


On Sat, 23 Aug 2008, Ingo Molnar wrote:

> > The sporadic nature of the bug means that it will probably take a
> > couple of days of testing and dozens of reboots w/o problems before
> > I'm confident to say that the problem's been fixed.
>
> btw., i've seen that lockup too on a testbox - but havent been able to
> track it down yet. You went further and presented the bug on a plate by
> tracking it down to the loop in get_rtc_time() - so now we can both test
> the fix and see whether the lockups are gone.

Hmm, waiting for 20ms with interrupts disabled sounds like a bad idea --
a lot of system timer ticks may be lost and some devices may not be happy
about such a delay either (serial?). I think your proposal only papers
over the actual bug and it should simply be forbidden to call this
function with interrupts disabled.

I can see this is a part of the legacy RTC interface, so except from the
long delay the function itself may have a chance to work with interrupts
disabled. But with the modern RTC library this is no longer true as some
RTC chips it supports can only be accessed with interrupts enabled,
because their accessor functions may sleep. This is for example common
with RTC chips wired through I2C.

Therefore I think to avoid propagating the design problem with places
calling get_rtc_time() when they eventually switch to the RTC library it
would be better to put:

BUG_ON(irqs_disabled());

into get_rtc_time() now and fix the callers which happen to trigger it.

Maciej
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/