Re: [patch] increase spinlock-debug looping timeouts from 1 sec to 1 min

From: Ingo Molnar
Date: Mon Jun 19 2006 - 07:43:24 EST



* Andrew Morton <akpm@xxxxxxxx> wrote:

> > The write_trylock + __delay in the loop is not a problem or a bug, as
> > the trylock will at most _increase_ the delay - and our goal is to not
> > have a false positive, not to be absolutely accurate about the
> > measurement here.
>
> Precisely. We have delays of over a second (but we don't know how
> much more than a second). Let's say two seconds. The NMI watchdog
> timeout is, what? Five seconds?

I don't see the problem. We'll have tried that lock hundreds of thousands
of times before this happens. The NMI watchdog will only trigger if we
do this with IRQs disabled. And it's not like the normal
__write_lock_failed codepath would be any different: for heavily
contended workloads the overhead is likely in the cacheline bouncing,
not in the __delay().

> That's getting too close. The result will be a total system crash.
> And RH are shipping this.

I don't see a connection. Pretty much the only thing the loop condition
impacts is the condition under which we print out an 'i think we
deadlocked' message. Have I missed your point, perhaps?
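
For illustration, here is a minimal userspace sketch of the kind of
trylock + delay loop being discussed. It is not the kernel code: the
names sketch_lock_t, sketch_write_trylock, sketch_delay and
sketch_write_lock_debug are hypothetical, the lock is a plain C11 atomic
flag, and the max_rounds bound is an assumption added only so the sketch
terminates. What it tries to show is that the loop count only decides
when the warning is printed; the polling itself is the same before and
after.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a write lock: 0 = free, 1 = held. */
typedef struct {
    atomic_int held;
} sketch_lock_t;

/* One non-blocking acquisition attempt; returns true if we took the lock. */
static bool sketch_write_trylock(sketch_lock_t *lock)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&lock->held, &expected, 1);
}

/* Stand-in for a very short delay between attempts: a brief busy-wait. */
static void sketch_delay(void)
{
    for (volatile int i = 0; i < 10; i++)
        ;
}

/*
 * Poll the lock up to `loops` times per round. When a round expires
 * without success, report a suspected lockup once and keep polling.
 * `max_rounds` exists only so this sketch can return (assumption).
 * `*warnings` is incremented each time the message is printed.
 */
static bool sketch_write_lock_debug(sketch_lock_t *lock, uint64_t loops,
                                    unsigned max_rounds, unsigned *warnings)
{
    bool print_once = true;

    for (unsigned round = 0; round < max_rounds; round++) {
        for (uint64_t i = 0; i < loops; i++) {
            if (sketch_write_trylock(lock))
                return true;
            sketch_delay();
        }
        /* Lockup suspected: report once, then continue trying. */
        if (print_once) {
            print_once = false;
            (*warnings)++;
            fprintf(stderr, "possible write_lock lockup\n");
        }
    }
    return false; /* reachable only because the sketch is bounded */
}
```

Raising `loops` in this sketch delays the message but does not change
how often the lock is attempted, and the extra time spent in the trylock
itself can only make the effective timeout longer, not shorter.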

Ingo