Re: [git pull] scheduler fixes

From: Ingo Molnar
Date: Fri Aug 01 2008 - 05:01:30 EST



* David Miller <davem@xxxxxxxxxxxxx> wrote:

> From: David Miller <davem@xxxxxxxxxxxxx>
> Date: Thu, 31 Jul 2008 15:55:04 -0700 (PDT)
>
> > I am absolutely sure, I spent the whole night yesterday trying to
> > debug this.
>
> Followup. I lost two days of my life debugging this because seemingly
> nobody can friggin' agree on what to do about the "printk() wakeup
> issue". Thanks!
>
> Can we fix this now, please?
>
> The problem was that Peter's patch triggers a print_deadlock_bug() in
> lockdep.c on the runqueue locks.
>
> But those printk()'s quickly want to do a wakeup, which wants to take
> the runqueue lock this thread already holds. So I would only get the
> first line of the lockdep debugging followed by a complete hang.

ugh. In the context of lockdep you are the first one to trigger this bug
- thanks for the fix!

We had a few other incidents of printks generated by bugs within the
scheduler code causing lockups (due to the wakeup) - and Steve Rostedt
sent a more generic solution for that: to first trylock the runqueue
lock in that case instead of doing an unconditional wakeup.

The patch made it to linux-next but Andrew NAK-ed it because it caused
other problems: it made printk wakeups conceptually less reliable. (a
lock held briefly by another CPU could make the trylock fail, so the
printk wakeup would not propagate and klogd could stay blocked
indefinitely)
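
To make the trade-off concrete, here is a minimal userspace sketch of the
trylock idea. It is not Steve's patch and not kernel code: struct fake_rq,
fake_rq_init() and try_wake_logger() are hypothetical names, and a pthread
mutex merely stands in for the runqueue lock.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a runqueue; not a kernel structure. */
struct fake_rq {
	pthread_mutex_t lock;	/* plays the role of the runqueue lock */
	int wakeups_delivered;	/* wakeups that got through, guarded by lock */
};

static void fake_rq_init(struct fake_rq *rq)
{
	pthread_mutex_init(&rq->lock, NULL);
	rq->wakeups_delivered = 0;
}

/*
 * Wake the log reader only if the lock can be taken without waiting.
 *
 * If the caller already holds rq->lock (the "printk from inside the
 * scheduler" case), an unconditional lock here would hang forever;
 * trylock fails instead and we return false.
 *
 * The cost: trylock also fails when some unrelated thread holds the
 * lock for a moment, and in that case the wakeup is simply lost.
 */
static bool try_wake_logger(struct fake_rq *rq)
{
	if (pthread_mutex_trylock(&rq->lock) != 0)
		return false;		/* lock busy: wakeup skipped */

	rq->wakeups_delivered++;	/* stands in for the real wakeup */
	pthread_mutex_unlock(&rq->lock);
	return true;
}
```

Calling try_wake_logger() while the same thread holds rq->lock returns
false rather than deadlocking, which is the behaviour the trylock approach
is after. The same false return on a lock held elsewhere is the lost
wakeup that the NAK was about.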

> Doing these wakeups in such a BUG message is unwise. Please can we
> apply something like the following and save other developers countless
> wasted hours of their time?
>
> Thanks :-)

applied to tip/core/locking.

I'm wondering, does this mean that Peter's:

lockdep: change scheduler annotation

is still not good enough yet?

Ingo