Re: uninterruptible sleep lockups

From: Anthony DiSante
Date: Mon Feb 21 2005 - 19:10:16 EST


Chris Friesen wrote:
>> It's indisputable that there will always be driver bugs and faulty hardware. Of course these should be fixed, but if it's possible for the kernel to gracefully deal with the bugs until they get fixed, then why shouldn't it do so?

> Think of the overhead required to track every single resource ever acquired by the process/thread/entity in question. Then if/when it hangs, you'd have to properly clean up every last one of them.

Yes, that would be difficult and expensive. But if permanently-D-stated processes happened monthly on 50% of systems, then wouldn't it be worth it? How about weekly on 10% of systems? The point is that past some threshold this becomes worth considering, and with more people adding more new hardware to their systems every day, this problem is becoming more and more frequent.

> Much safer/simpler to leave it hung, and force an eventual reboot.

"Eventual" makes it sound far away, but the reality is that if part of your USB subsystem is D-stated, then "eventually" means next time you want to use your USB stick, or your printer, or your digital camera, or your MP3 device, or... In other words, "eventually" means "right now, interrupting all your current work."

> If you have been given code that causes D states, bitch to the supplier until they fix it.

The driver code for my devices has "been given" to me as part of the kernel. A handful of different USB devices have each caused permanent D states, as have a CD-ROM drive and a NIC. I guess I'll need to start debugging all of these drivers. When something goes into permanent D sleep, what should I do to start tracking down the problem, aside from the obvious steps of checking dmesg and /var/log/messages? Neither of those ever seems to say anything useful when this happens.
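
One possible first step beyond dmesg is to list which tasks are currently in state D and, where the kernel exposes it, the function each one is blocked in. The following is only a minimal sketch, assuming a Linux /proc filesystem; the helper names (parse_stat, read_file, d_state_tasks) are hypothetical, and /proc/<pid>/wchan may be empty, "0", or missing depending on the kernel version and configuration.

```python
import os

def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = text.index("(")
    rparen = text.rindex(")")
    pid = int(text[:lparen].strip())
    comm = text[lparen + 1:rparen]
    state = text[rparen + 1:].split()[0]
    return pid, comm, state

def read_file(path):
    """Read a small /proc file; return '' if it is unreadable or gone."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return ""

def d_state_tasks(proc_root="/proc"):
    """Return a list of (pid, comm, wchan) for tasks in uninterruptible sleep."""
    tasks = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return tasks
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        text = read_file(os.path.join(proc_root, entry, "stat"))
        if not text:
            continue  # process exited while we were scanning
        try:
            pid, comm, state = parse_stat(text)
        except (ValueError, IndexError):
            continue
        if state == "D":
            # wchan availability is an assumption; fall back to "?"
            wchan = read_file(os.path.join(proc_root, entry, "wchan")) or "?"
            tasks.append((pid, comm, wchan))
    return tasks

if __name__ == "__main__":
    for pid, comm, wchan in d_state_tasks():
        print(f"{pid}\t{comm}\t{wchan}")
```

A task that shows up in this list on every run, always with the same wait channel, at least points at the code path that never wakes it. If magic SysRq is enabled in the kernel, writing "t" to /proc/sysrq-trigger should also dump every task's stack to the kernel log, which may say more than dmesg does on its own.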

> Kernel bugs are not acceptable.

That's a nice-sounding ideal, but the truth is that kernel bugs exist and are not uncommon.

-Anthony DiSante
http://nodivisions.com/