Re: "blocked for more than 120 secs" --> a valid situation, how to prevent?

From: Mark Lord
Date: Thu Sep 23 2010 - 23:51:30 EST


On 10-09-23 10:53 PM, Mark Lord wrote:
> On 10-09-23 08:05 PM, Douglas Gilbert wrote:
>> Mark,
>> If you issued the SG_IO ioctl with a timeout of at
>> least 66 minutes (expressed in milliseconds) then
>> it looks like ata_scsi_queuecmd() has a problem.
>> ..
>
> Mmm.. more like blk_execute_rq() perhaps.
> That's where the wait_for_completion(&wait) call is at.
>
> Perhaps I should change it to wait in smaller increments,
> so that the lockup detection doesn't trigger on it..
> ..

This patch (below) seems to work.

Does this look kosher enough for me to roll it up
as a proper patch submission? Jens? Joel?

The problem, again, is that the hung-task check fires
inappropriately during very long SG_IO commands,
such as --security-erase operations, which can take minutes or hours to complete.

Thanks

--- old/block/blk-exec.c 2010-08-26 19:47:12.000000000 -0400
+++ linux/block/blk-exec.c 2010-09-23 23:41:47.478826002 -0400
@@ -95,7 +95,8 @@
 	rq->end_io_data = &wait;
 	blk_execute_rq_nowait(q, bd_disk, rq, at_head, blk_end_sync_rq);
-	wait_for_completion(&wait);
+	while (!wait_for_completion_timeout(&wait, (sysctl_hung_task_timeout_secs >> 1) * HZ))
+		; /* periodic wakeup prevents "hung_task" warnings */
 	if (rq->errors)
 		err = -EIO;
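
One case the loop above may need to handle: if sysctl_hung_task_timeout_secs
is 0 (hung-task checking disabled) or 1, the shifted timeout becomes 0 and the
while loop would presumably spin instead of sleeping. Below is a rough
userspace model of the sliced wait with that case guarded. It is only a sketch,
not kernel code: wait_slice_ticks(), struct fake_completion,
fake_wait_timeout() and wait_in_slices() are hypothetical names made up for
illustration, and "ticks" stands in for jiffies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: length of one wait slice, in ticks.
 * A return value of 0 means "checking disabled, use an untimed wait". */
static unsigned long wait_slice_ticks(unsigned long hung_timeout_secs,
				      unsigned long hz)
{
	unsigned long slice;

	assert(hz > 0);
	if (hung_timeout_secs == 0)
		return 0;
	/* Half of the hung-task timeout, but never a zero-length slice. */
	slice = (hung_timeout_secs * hz) / 2;
	return slice ? slice : 1;
}

/* Hypothetical stand-in for a completion that finishes after some ticks. */
struct fake_completion {
	unsigned long ticks_left;
};

/* Models a timed wait: true if the work finished within `timeout` ticks. */
static bool fake_wait_timeout(struct fake_completion *c, unsigned long timeout)
{
	if (c->ticks_left <= timeout) {
		c->ticks_left = 0;
		return true;
	}
	c->ticks_left -= timeout;
	return false;
}

/* Waits in slices and returns the number of timed waits performed.
 * Returns 0 when the untimed path is taken (timeout setting of 0). */
static unsigned long wait_in_slices(struct fake_completion *c,
				    unsigned long hung_timeout_secs,
				    unsigned long hz)
{
	unsigned long slice = wait_slice_ticks(hung_timeout_secs, hz);
	unsigned long waits = 0;

	if (slice == 0) {
		/* Models a plain, untimed wait_for_completion(). */
		c->ticks_left = 0;
		return 0;
	}
	do {
		waits++;
	} while (!fake_wait_timeout(c, slice));
	return waits;
}
```

In this model, a 66 minute command at 250 ticks per second with the default
120 second setting is covered by 66 timed waits of 60 seconds each, so the
task never sleeps uninterrupted for longer than half the hung-task timeout.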