Re: Soft lockup regression from today's sched.git merge.

From: David Miller
Date: Wed Apr 23 2008 - 01:42:23 EST


From: Ingo Molnar <mingo@xxxxxxx>
Date: Tue, 22 Apr 2008 11:14:56 +0200

> thanks for reporting it. I haven't seen this false positive happen in a
> long time - but then again, PC CPUs are a lot less idle than a 128-CPU
> Niagara2 :-/ I'm wondering what the best method would be to provoke a
> CPU to stay idle that long - to make sure this bug is fixed.

I looked more closely at this.

There is no way the patch in question can work properly.

The algorithm is, essentially, "if time - prev_cpu_time is large enough,
call __sync_cpu_clock()", which is fine, except that nothing ever sets
prev_cpu_time.

The code is fatally flawed: once __sync_cpu_clock() calls start
happening, they will happen on every cpu_clock() call.
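
A minimal user-space sketch of the logic described above, not the kernel
code itself. The struct, the `*_model`/`*_as_described`/`*_with_update`
names and the `count_syncs()` driver are hypothetical; the threshold
assumes the 100000 ns interval. The second variant, which records the
time of the sync, is only one possible shape of a fix, shown for contrast.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sync interval, taken from the 100000 ns figure in this thread. */
#define SYNC_THRESHOLD_NS 100000ULL

/* Hypothetical per-cpu state; not the kernel's actual layout. */
struct cpu_clock_state {
	uint64_t prev_cpu_time;    /* time of the last sync */
	unsigned long sync_calls;  /* how often the slow path ran */
};

/* Stand-in for __sync_cpu_clock(): only counts invocations. */
static void sync_cpu_clock_model(struct cpu_clock_state *s)
{
	s->sync_calls++;
}

/* As described: prev_cpu_time is compared but never updated. */
static void cpu_clock_as_described(struct cpu_clock_state *s, uint64_t now)
{
	if (now - s->prev_cpu_time > SYNC_THRESHOLD_NS)
		sync_cpu_clock_model(s);
}

/* Same check, but remembering when the sync happened. */
static void cpu_clock_with_update(struct cpu_clock_state *s, uint64_t now)
{
	if (now - s->prev_cpu_time > SYNC_THRESHOLD_NS) {
		sync_cpu_clock_model(s);
		s->prev_cpu_time = now;
	}
}

/* Call fn `calls` times, advancing time by step_ns before each call. */
static unsigned long count_syncs(void (*fn)(struct cpu_clock_state *, uint64_t),
				 unsigned int calls, uint64_t step_ns)
{
	struct cpu_clock_state s = { 0, 0 };
	uint64_t now = 0;
	unsigned int i;

	assert(fn != 0);
	for (i = 0; i < calls; i++) {
		now += step_ns;
		fn(&s, now);
	}
	return s.sync_calls;
}
```

With 1000 calls spaced 1000 ns apart, the first variant takes the slow
path 900 times (every call once the threshold is crossed), the second
9 times.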

So, as my bisect showed from the get-go, these cpu_clock() changes
have major problems. It was quite a mind-boggling stretch to stick
a touch_softlockup_watchdog() call somewhere to try to fix this
when the guilty change in question didn't touch that area at all.
:-(

Furthermore, this is an extremely expensive way to ensure monotonic
per-rq timestamps. A global spinlock taken every 100000 ns on every
cpu?!?! :-/

At least remove any implication of "high speed" from the comments above
cpu_clock() if we're going to need something like this. I have 128
cpus, that's 128 grabs of that spinlock every quantum. The next system
I'm getting will have 256 cpus. The expense of your solution
increases linearly with the number of cpus, which doesn't scale.
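
As a rough worked figure for the cost above, here is a small sketch of
the arithmetic. It assumes the worst case where every cpu takes the
global lock once per 100000 ns interval; `lock_grabs_per_sec()` is a
hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Total acquisitions of one global lock per second, assuming each of
 * ncpus cpus takes it once every interval_ns nanoseconds.
 */
static uint64_t lock_grabs_per_sec(unsigned int ncpus, uint64_t interval_ns)
{
	assert(interval_ns > 0);
	return (uint64_t)ncpus * (NSEC_PER_SEC / interval_ns);
}
```

Under that assumption, 128 cpus give 1280000 acquisitions per second
and 256 cpus give 2560000, all contending on the same lock.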

Anyways, I'll work on the group sched lockup bug next. As if I have
nothing better to do during the merge window than fix sched tree
regressions :-(