Re: work queue of scsi fc transports should be serialized
From: Bart Van Assche
Date: Fri May 19 2017 - 18:35:57 EST
On Fri, 2017-05-19 at 09:36 +0000, Dashi DS1 Cao wrote:
> It seems there is a race of multiple "fc_starget_delete" of the same rport,
> thus of the same SCSI host. The race leads to the race of scsi_remove_target
> and it cannot be prevented by the code snippet alone, even of the most recent
> version:
> spin_lock_irqsave(shost->host_lock, flags);
> list_for_each_entry(starget, &shost->__targets, siblings) {
> if (starget->state == STARGET_DEL ||
> starget->state == STARGET_REMOVE)
> continue;
> If there is a possibility that the starget is under deletion(state ==
> STARGET_DEL), it should be possible that list_next_entry(starget, siblings)
> could cause a read access violation.
Hello Dashi,
Something else must be going on. From scsi_remove_target():
restart:
spin_lock_irqsave(shost->host_lock, flags);
list_for_each_entry(starget, &shost->__targets, siblings) {
if (starget->state == STARGET_DEL ||
starget->state == STARGET_REMOVE)
continue;
if (starget->dev.parent == dev || &starget->dev == dev) {
kref_get(&starget->reap_ref);
starget->state = STARGET_REMOVE;
spin_unlock_irqrestore(shost->host_lock, flags);
__scsi_remove_target(starget);
scsi_target_reap(starget);
goto restart;
}
}
spin_unlock_irqrestore(shost->host_lock, flags);
In other words, before scsi_remove_target() decides to call
__scsi_remove_target(), it changes the target state into STARGET_REMOVE
while holding the host lock. This means that scsi_remove_target() won't
call __scsi_remove_target() twice and also that it won't invoke
list_next_entry(starget, siblings) after starget has been freed.
Bart.