Re: [PATCH] scsi_debug: deadlock between completions and surprise module removal

From: Douglas Gilbert
Date: Mon Sep 08 2014 - 16:31:18 EST


On 14-09-08 11:07 AM, Christoph Hellwig wrote:
On Mon, Sep 08, 2014 at 11:11:23AM +0200, Bart Van Assche wrote:
Hello Doug,

In the scsi_debug driver scsi_remove_host() is called from inside the
sdebug_driver_remove() callback function. Unless I have missed something it
is not guaranteed that that callback function is invoked before unloading of
the scsi_debug driver has finished. I think most of the code in
sdebug_driver_remove() should be moved to sdebug_remove_adapter().

I'm not sure that's right. scsi_debug uses the driver mode in a
slightly unusual way, and includes both the bus driver, device and
device driver.

sdebug_driver_remove is a bus method, but as we don't have driver methods
should act very much like all other _remove callbacks.

sdebug_remove_adapter is more a "bus-level" function that calls into
the driver model to unbind devices from the driver.

But we defintively shouldn't stop and free queued command before we
fully remove the hosts. As far as I can tell the stop_all_queued
call can be entirely removed from the remove path, as the midlayer
will take care of waiting for all commands to return, and the
free_all_queued should be after all hosts are unregistered.

stop_all_queued() is doing hrtimer_cancel(), del_timer_sync()
or tasklet_kill() on all the scsi_cmnd objects that are
"in play". Unless another mechanism calls the .eh_abort_handler
entry point reliably on each "in play" command then the module
cannot be removed. That is because some timer expiry callbacks
are pending.

Something like the (untested) patch below would do the trick.
We'd still need Dougs patch for the EH case, though.

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index d19c0e3..d022c2f 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -3983,14 +3983,13 @@ static void __exit scsi_debug_exit(void)
{
int k = scsi_debug_add_host;

- stop_all_queued();
- free_all_queued();
for (; k; k--)
sdebug_remove_adapter();
driver_unregister(&sdebug_driverfs_driver);
bus_unregister(&pseudo_lld_bus);
root_device_unregister(pseudo_primary);

+ free_all_queued();
if (dif_storep)
vfree(dif_storep);

--

The only other call to stop_all_queued() is from the
.eh_host_reset_handler entry point.

Doug Gilbert


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/