Re: [PATCH] scsi_debug: deadlock between completions and surprise module removal

From: Douglas Gilbert
Date: Sat Sep 06 2014 - 10:40:25 EST


On 14-09-05 11:25 AM, Bart Van Assche wrote:
> On 09/05/14 15:56, Douglas Gilbert wrote:
> > With scsi-mq I think many LLDs probably have a new
> > race possibility between a surprise rmmod of the LLD
> > and another thread presenting a new command at about
> > the same time (or another thread's command completing
> > around that time). Does anything above the LLD stop
> > this happening?
> >
> > Looking at mpt3sas and hpsa module exit calls, they don't
> > seem to guard against this possibility.
> >
> > The test is pretty easy: build the LLD as a module, load
> > it and fire up a multi-thread, libaio fio test on one or
> > more devices (SSDs would probably be good) on that LLD.
> > While the test is running, do 'rmmod LLD'.
>
> An LLD must call scsi_remove_host() directly or indirectly from the module
> cleanup path. scsi_remove_host() triggers a call to blk_cleanup_queue(). That
> function sets the QUEUE_FLAG_DYING flag, which prevents new I/O from being
> queued, and waits until previously queued requests have finished before returning.

And they do call scsi_remove_host(). But they do that toward
the end of their clean-up. The problem that I observed has
already happened before that.

IOW I think the QUEUE_FLAG_DYING state needs to be set and
acknowledged as the first order of business by the code
that implements 'rmmod LLD'.
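
To make the suggested ordering concrete, here is a minimal userspace
sketch of "mark dying first, drain, then tear down". It is only a model:
struct model_host and the model_*() helpers are hypothetical names
invented for illustration, not kernel APIs, and the sketch does not
claim to show how scsi_remove_host() or blk_cleanup_queue() work
internally.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of an LLD host; not a kernel structure. */
struct model_host {
	atomic_bool dying;       /* stands in for QUEUE_FLAG_DYING */
	atomic_int in_flight;    /* commands accepted but not yet completed */
	bool resources_freed;    /* stands in for LLD-private teardown */
};

static void model_init(struct model_host *h)
{
	atomic_init(&h->dying, false);
	atomic_init(&h->in_flight, 0);
	h->resources_freed = false;
}

/*
 * Submission path. The command is counted before the dying flag is
 * checked, so a remover that sets the flag and then waits for
 * in_flight == 0 cannot miss a command that was accepted concurrently.
 */
static bool model_submit(struct model_host *h)
{
	atomic_fetch_add(&h->in_flight, 1);
	if (atomic_load(&h->dying)) {
		atomic_fetch_sub(&h->in_flight, 1);
		return false;    /* rejected: host is going away */
	}
	return true;
}

/* Completion path. Running after teardown is the bug being modelled. */
static void model_complete(struct model_host *h)
{
	assert(!h->resources_freed);
	atomic_fetch_sub(&h->in_flight, 1);
}

/* Removal step 1: the first order of business in module exit. */
static void model_mark_dying(struct model_host *h)
{
	atomic_store(&h->dying, true);
}

/* Removal step 2: true once every accepted command has completed. */
static bool model_drained(struct model_host *h)
{
	return atomic_load(&h->in_flight) == 0;
}

/* Removal step 3: refuse to free anything until dying and drained. */
static bool model_teardown(struct model_host *h)
{
	if (!atomic_load(&h->dying) || !model_drained(h))
		return false;
	h->resources_freed = true;
	return true;
}
```

In this model a command accepted before model_mark_dying() keeps
model_teardown() from succeeding until model_complete() has run, and
any model_submit() after the flag is set is rejected. A real LLD would
have to wait for the drain rather than poll, which the sketch leaves
out.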

Doug Gilbert