Re: [PATCH] scsi_debug: deadlock between completions and surprise module removal

From: Douglas Gilbert
Date: Fri Sep 05 2014 - 09:56:53 EST


With scsi-mq I think many LLDs probably have a new
race possibility between a surprise rmmod of the LLD
and another thread presenting a new command at about
the same time (or another thread's command completing
around that time). Does anything above the LLD stop
this happening?

Looking at mpt3sas and hpsa module exit calls, they don't
seem to guard against this possibility.

The test is pretty easy: build the LLD as a module, load
it and fire up a multi-thread, libaio fio test on one or
more devices (SSDs would probably be good) on that LLD.
While the test is running, do 'rmmod LLD'.

Doug Gilbert


On 14-09-05 01:24 AM, Christoph Hellwig wrote:
Can I get another review for this one?

On Sun, Aug 31, 2014 at 07:09:59PM -0400, Douglas Gilbert wrote:
A deadlock has been reported when the completion
of SCSI commands (simulated by a timer) was surprised
by a module removal. This patch removes one half of
the offending locks around timer deletions. This fix
is applied both to stop_all_queued() which is were
the deadlock was discovered and stop_queued_cmnd()
which has very similar logic.

This patch should be applied both to the lk 3.17 tree
and Christoph's drivers-for-3.18 tree.

Tested-and-reported-by: Milan Broz <gmazyland@xxxxxxxxx>
Signed-off-by: Douglas Gilbert <dgilbert@xxxxxxxxxxxx>

--- a/drivers/scsi/scsi_debug.c 2014-08-26 13:24:51.646948507 -0400
+++ b/drivers/scsi/scsi_debug.c 2014-08-30 18:04:54.589226679 -0400
@@ -2743,6 +2743,13 @@ static int stop_queued_cmnd(struct scsi_
if (test_bit(k, queued_in_use_bm)) {
sqcp = &queued_arr[k];
if (cmnd == sqcp->a_cmnd) {
+ devip = (struct sdebug_dev_info *)
+ cmnd->device->hostdata;
+ if (devip)
+ atomic_dec(&devip->num_in_q);
+ sqcp->a_cmnd = NULL;
+ spin_unlock_irqrestore(&queued_arr_lock,
+ iflags);
if (scsi_debug_ndelay > 0) {
if (sqcp->sd_hrtp)
hrtimer_cancel(
@@ -2755,18 +2762,13 @@ static int stop_queued_cmnd(struct scsi_
if (sqcp->tletp)
tasklet_kill(sqcp->tletp);
}
- __clear_bit(k, queued_in_use_bm);
- devip = (struct sdebug_dev_info *)
- cmnd->device->hostdata;
- if (devip)
- atomic_dec(&devip->num_in_q);
- sqcp->a_cmnd = NULL;
- break;
+ clear_bit(k, queued_in_use_bm);
+ return 1;
}
}
}
spin_unlock_irqrestore(&queued_arr_lock, iflags);
- return (k < qmax) ? 1 : 0;
+ return 0;
}

/* Deletes (stops) timers or tasklets of all queued commands */
@@ -2782,6 +2784,13 @@ static void stop_all_queued(void)
if (test_bit(k, queued_in_use_bm)) {
sqcp = &queued_arr[k];
if (sqcp->a_cmnd) {
+ devip = (struct sdebug_dev_info *)
+ sqcp->a_cmnd->device->hostdata;
+ if (devip)
+ atomic_dec(&devip->num_in_q);
+ sqcp->a_cmnd = NULL;
+ spin_unlock_irqrestore(&queued_arr_lock,
+ iflags);
if (scsi_debug_ndelay > 0) {
if (sqcp->sd_hrtp)
hrtimer_cancel(
@@ -2794,12 +2803,8 @@ static void stop_all_queued(void)
if (sqcp->tletp)
tasklet_kill(sqcp->tletp);
}
- __clear_bit(k, queued_in_use_bm);
- devip = (struct sdebug_dev_info *)
- sqcp->a_cmnd->device->hostdata;
- if (devip)
- atomic_dec(&devip->num_in_q);
- sqcp->a_cmnd = NULL;
+ clear_bit(k, queued_in_use_bm);
+ spin_lock_irqsave(&queued_arr_lock, iflags);
}
}
}

---end quoted text---


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/