Re: uninterruptible sleep lockups
From: Anthony DiSante
Date: Mon Feb 21 2005 - 19:10:16 EST
Chris Friesen wrote:
>> It's indisputable that there will always be driver bugs and faulty
>> hardware. Of course these should be fixed, but if it's possible for
>> the kernel to gracefully deal with the bugs until they get fixed, then
>> why shouldn't it do so?
> Think of the overhead required to track every single resource ever
> acquired by the process/thread/entity in question. Then if/when it
> hangs, you'd have to properly clean up every last one of them.
Yes, that would be difficult and expensive. But if permanently-D-stated
processes happened monthly on 50% of systems, then wouldn't it be worth it?
How about weekly on 10% of systems? The point is that at some point this
becomes worth considering, and with more people adding more new hardware to
their systems every day, this problem is becoming more and more frequent.
> Much safer/simpler to leave it hung, and force an eventual reboot.
"Eventual" makes it sound far away, but the reality is that if part of your
USB subsystem is D-stated, then "eventually" means next time you want to use
your USB stick, or your printer, or your digital camera, or your MP3 device,
or... In other words, "eventually" means "right now, interrupting all your
current work."
> If you have been given code that causes D states, bitch to the supplier
> until they fix it.
The driver code for my devices has "been given" to me as part of the kernel.
Any of a handful of USB devices has caused permanent D states, as have a
CDROM and a NIC. I guess I'll need to start debugging all of these drivers.
When something goes into permanent D sleep, what should I do to start
tracking down the problem, aside from obvious steps like checking dmesg and
/var/log/messages? Neither of those ever seems to say anything useful when
this happens.
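
One possible first step, sketched here under assumptions that are not part
of this thread: on a kernel that exposes /proc/<pid>/stat and
/proc/<pid>/wchan, a small script can list every process currently in state
D together with the kernel symbol it is sleeping in, which at least hints at
which driver or subsystem to look at. The function names parse_stat and
d_state_tasks are hypothetical, and wchan may just read "0" or be empty on
kernels built without symbol information.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the first '(' and the last ')' instead of splitting on spaces.
    lparen = text.index("(")
    rparen = text.rindex(")")
    pid = int(text[:lparen])
    comm = text[lparen + 1:rparen]
    state = text[rparen + 1:].split()[0]
    return pid, comm, state


def read_file(path):
    """Read a small /proc file; return '' if it vanished or is unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return ""


def d_state_tasks(proc_root="/proc"):
    """List (pid, comm, wchan) for processes in uninterruptible sleep.

    Assumption: only the numeric top-level entries are scanned, so
    individual threads under /proc/<pid>/task are not reported.
    """
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        stat = read_file(os.path.join(proc_root, entry, "stat"))
        try:
            pid, comm, state = parse_stat(stat)
        except (ValueError, IndexError):
            continue  # process exited mid-scan or the line was malformed
        if state == "D":
            wchan = read_file(os.path.join(proc_root, entry, "wchan")) or "?"
            tasks.append((pid, comm, wchan))
    return tasks


if __name__ == "__main__" and os.path.isdir("/proc"):
    for pid, comm, wchan in d_state_tasks():
        print(f"{pid:>7}  {comm:<20}  {wchan}")
```

Running it a few times in a row helps separate processes that are only
briefly in D state (normal disk I/O) from ones that stay there. The wchan
column only names the function the task is blocked in. It does not say why
the wakeup never arrives.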
> Kernel bugs are not acceptable.
That's a nice-sounding ideal, but the truth is that kernel bugs exist and
are not uncommon.
-Anthony DiSante
http://nodivisions.com/