On Mon, May 29, 2000 at 07:03:35PM +0100, Alan Cox wrote:
> > Is this the "disable plugging" fix? Or something else?
>
> Disable plugging. Its not merged because nobody has yet explained precisely
> why it works and if its the right solution.

OK, I'll explain why I think it works. Taking a typical example of the loop
device hosted on an ide device (or a file on an ide device):

With plugging enabled, you can end up with a situation where the tq_disk task
queue looks like this:

    tq_disk -> loop_tq -> ide_tq

(where loop_tq and ide_tq are the task_queue members of the appropriate request
queues)

When tq_disk is run, loop's do_request gets run first. It submits some
I/O to the ide device (possibly indirectly through the fs code), runs tq_disk
again to kick it, and then sleeps waiting for the I/O to complete
(wait_on_buffer / wait_on_page). However, the initial run_task_queue will
have set tq_disk to NULL before calling loop, so when loop runs it, nothing
happens: ide_tq is never run, and loop sleeps forever, leaving loop and ide
"unreachable" through tq_disk.
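
To make the sequence concrete, here is a rough userspace model of it. This is
only a sketch: the list handling is a simplified stand-in for the real task
queue code, the handler bodies and simulate_plugged_loop() are made up for
illustration, and since the model cannot actually sleep it just records
whether loop reached its wait with ide still plugged.

```c
#include <stddef.h>

/* Simplified stand-in for a task queue: a singly linked list of callbacks. */
struct tq_entry {
    struct tq_entry *next;
    void (*routine)(void *data);
    void *data;
};
typedef struct tq_entry *task_queue;

static task_queue tq_disk;      /* models the global tq_disk */
static int ide_unplugged;       /* set once ide's queue entry has run */
static int loop_would_sleep;    /* set if loop's kick achieved nothing */

/* Append an entry to the tail of a queue (assumed ordering for this model). */
static void queue_task(struct tq_entry *e, task_queue *q)
{
    e->next = NULL;
    while (*q)
        q = &(*q)->next;
    *q = e;
}

/* Detach the whole list first, then run each entry in turn. */
static void run_task_queue(task_queue *q)
{
    struct tq_entry *e = *q;

    *q = NULL;                  /* the queue head is empty from here on */
    while (e) {
        struct tq_entry *next = e->next;

        e->next = NULL;
        e->routine(e->data);
        e = next;
    }
}

/* Hypothetical ide handler: just note that the ide queue got unplugged. */
static void ide_unplug(void *data)
{
    (void)data;
    ide_unplugged = 1;
}

/* Hypothetical loop handler: submit I/O to ide, kick tq_disk, then wait. */
static void loop_do_request(void *data)
{
    (void)data;
    run_task_queue(&tq_disk);   /* list is already detached: does nothing */
    /* The real code would sleep here; the model only records the state. */
    loop_would_sleep = !ide_unplugged;
}

static struct tq_entry loop_tq = { NULL, loop_do_request, NULL };
static struct tq_entry ide_tq  = { NULL, ide_unplug, NULL };

/* Returns 1 if loop reached its wait while ide was still plugged. */
int simulate_plugged_loop(void)
{
    tq_disk = NULL;
    ide_unplugged = 0;
    loop_would_sleep = 0;
    queue_task(&loop_tq, &tq_disk);     /* tq_disk -> loop_tq */
    queue_task(&ide_tq, &tq_disk);      /* tq_disk -> loop_tq -> ide_tq */
    run_task_queue(&tq_disk);
    return loop_would_sleep;
}
```

In the model the outer loop carries on and runs ide_tq after loop's handler
returns. In the kernel case described above it cannot, because loop is asleep
inside its handler at that point.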

IIRC, however, Jens tried this solution and still saw deadlocks, this time
in the get_request() code, which suggests we've got two different problems.

The solution seems to me to be a choice between:

- disable plugging in loop, and whatever kludge is required to avoid the
  other deadlock
- fix loop to stop it sleeping in do_request: stick the pending I/O on
  another queue which can be serviced by a separate thread (we might be able
  to avoid this for block device-backed loops)
- fix the block device layer to be more friendly to "stacked" devices (loop,
  lvm, md, etc.)
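
For the second option, the shape I have in mind is roughly the following.
This is a userspace sketch, not a patch: struct pending_io, struct
pending_list, loop_defer_io() and loop_service_pending() are all hypothetical
names, the locking a real version would need is left out, and the "I/O" is
just a flag being set.

```c
#include <stddef.h>

/* Hypothetical record for one piece of deferred I/O. */
struct pending_io {
    struct pending_io *next;
    int sector;                 /* illustrative payload only */
    int done;                   /* set once the service side has handled it */
};

/* FIFO of I/O the request function has accepted but not yet issued. */
struct pending_list {
    struct pending_io *head;
    struct pending_io *tail;
};

/* Request-function side: record the work and return at once, never sleep. */
void loop_defer_io(struct pending_list *pl, struct pending_io *io)
{
    io->next = NULL;
    io->done = 0;
    if (pl->tail)
        pl->tail->next = io;
    else
        pl->head = io;
    pl->tail = io;
}

/*
 * Service side: this would be the body of the separate thread, which is
 * free to block. Returns the number of items it completed.
 */
int loop_service_pending(struct pending_list *pl)
{
    int completed = 0;

    while (pl->head) {
        struct pending_io *io = pl->head;

        pl->head = io->next;
        if (!pl->head)
            pl->tail = NULL;
        /* Blocking I/O to the backing device would be issued here. */
        io->done = 1;
        completed++;
    }
    return completed;
}
```

The point is that do_request only ever calls something like loop_defer_io()
and returns, so the tq_disk run that called it can carry on to the next entry.
All the sleeping moves into whatever context calls loop_service_pending().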