Re: [feature] automatically detect hung TASK_UNINTERRUPTIBLE tasks

From: Mark Lord
Date: Wed Dec 05 2007 - 17:31:42 EST


Arjan van de Ven wrote:
On Mon, 3 Dec 2007 11:27:15 +0100
Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:

Kernel waiting 2 minutes on TASK_UNINTERRUPTIBLE is certainly
broken.
What should it do when the NFS server doesn't answer anymore, or when
the network to the SAN RAID array located a few hundred km away
develops some hiccup? Or the SCSI driver just decides to do lengthy
error recovery -- you could argue that is broken if it takes longer
than 2 minutes, but in practice these things are hard to test
and to fix.


the scsi layer will have the IO totally aborted within that time anyway;
the retry timeout for disks is 30 seconds after all.
..

Mmm.. but the SCSI layer may do many retries, each with 30sec timeouts..
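
A back-of-the-envelope sketch of that arithmetic, in Python. The 30 second
timeout and the 2 minute threshold are the figures quoted above; the retry
counts are purely illustrative assumptions, not values read out of any
particular driver, and the helper name is made up for this example:

```python
# Worst case: every attempt, the first try and each retry, runs to its
# full timeout before the I/O is finally failed back to the caller.
PER_ATTEMPT_TIMEOUT_SECS = 30   # the disk timeout quoted in the thread
HUNG_TASK_THRESHOLD_SECS = 120  # the "2 minutes" quoted in the thread


def worst_case_wait_secs(timeout_secs: int, retries: int) -> int:
    """Upper bound on how long a task could block on one failing I/O."""
    attempts = 1 + retries  # the initial try plus each retry
    return timeout_secs * attempts


if __name__ == "__main__":
    # Retry counts 0..5 are hypothetical, chosen only to show the trend.
    for retries in range(6):
        total = worst_case_wait_secs(PER_ATTEMPT_TIMEOUT_SECS, retries)
        verdict = "exceeds" if total > HUNG_TASK_THRESHOLD_SECS else "within"
        print(f"{retries} retries: {total}s ({verdict} the 2 minute threshold)")
```

Under these assumptions, four or more full-length retries are enough to keep
a task blocked past the 2 minute mark without anything actually being hung.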