Re: [PATCH] scsi_lib.c: sleeping function called from invalid context

From: James Bottomley
Date: Thu Sep 24 2009 - 21:23:47 EST


On Thu, 2009-09-24 at 15:56 -0700, Andrew Morton wrote:
> On Wed, 23 Sep 2009 17:58:47 +0000
> iceberg <strakh@xxxxxxxxx> wrote:
>
> > Driver scsi_lib.c might sleep in atomic context, because it calls
> > scsi_device_put under spin_lock_irqsave.
> > drivers/scsi/scsi_lib.c:356:
> > spin_lock_irqsave(shost->host_lock, flags);
> > scsi_device_put(sdev);
> > Path to might_sleep macro from scsi_device_put:
> > 1. scsi_device_put calls put_device at ./drivers/scsi/scsi.c:1111
> > 2. put_device calls kobject_put at ./drivers/base/core.c:1038
> > 3. kobject_put calls kref_put at ./lib/kobject.c
> > 4. kref_put may call callback function kobject_release at ./lib/kref.c if
> > refcount becomes zero, which might_sleep because it calls user event. Details:
> > 4.1 kobject_cleanup calls kobject_uevent at ./lib/kobject.c:555
> > 4.2 kobject_uevent calls kobject_uevent_env at ./lib/kobject_uevent.c:282
> > 4.3 kobject_uevent_env calls call_usermodehelper_exec at
> > ./include/linux/kmod.h:83
> > 4.4 call_usermodehelper_exec calls wait_for_completion at
> > ./kernel/kmod.c:481
> > 4.5 wait_for_completion calls wait_for_common at ./kernel/sched.c:5710
> > 4.6 wait_for_common calls might_sleep at ./kernel/sched.c:5692
> >
> > Found by Linux Driver Verification project.
> >
> > Delete wrong sleeping function calls.
> >
> > Signed-off-by: Alexander Strakh <strakh@xxxxxxxxx>
> >
> > ---
> > diff --git a/./a/drivers/scsi/scsi_lib.c b/./b/drivers/scsi/scsi_lib.c
> > index f3c4089..a8f8e2f 100644
> > --- a/./a/drivers/scsi/scsi_lib.c
> > +++ b/./b/drivers/scsi/scsi_lib.c
> > @@ -353,9 +353,9 @@ static void scsi_single_lun_run(struct scsi_device
> > *current_sdev)
> >
> > spin_unlock_irqrestore(shost->host_lock, flags);
> > blk_run_queue(sdev->request_queue);
> > - spin_lock_irqsave(shost->host_lock, flags);
> >
> > - scsi_device_put(sdev);
> > + scsi_device_put(sdev);
> > + spin_lock_irqsave(shost->host_lock, flags);
> > }
> > out:
> > spin_unlock_irqrestore(shost->host_lock, flags);
> >
>
> Well this is strange. afaict all the code to which you refer is
> ancient, so why did this bug just pop up now?

No idea. I think the root cause of this is in the kobject code: we
explicitly require the ability to call last put from interrupt context
(and that includes holding locks). I'll talk to Greg and Kai about
this (they're both here at plumbers). I think the fix is to indirect
the kobject uevent stuff via a usermode helper so we don't get this
problem.
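
For anyone following along, the ordering issue in the quoted patch can be modelled in userspace. This is only a rough sketch under stated assumptions: every fake_* name below is hypothetical and is not kernel API, the lock is a plain counter rather than a real spinlock, and the release path is reduced to a single might_sleep-style check. It shows why a last put issued while the lock is held trips the check, and why doing the put before retaking the lock (the order the patch proposes) does not.

```c
/* Userspace model only: every name here is hypothetical, not kernel API. */
#include <assert.h>

static int lock_depth;	/* > 0 models "host_lock held", i.e. atomic context */
static int violations;	/* times a sleeping path was entered while atomic */

static void fake_lock(void)
{
	lock_depth++;
}

static void fake_unlock(void)
{
	assert(lock_depth > 0);
	lock_depth--;
}

/* Stand-in for might_sleep(): counts a violation instead of warning. */
static void fake_might_sleep(void)
{
	if (lock_depth > 0)
		violations++;
}

struct fake_dev {
	int refcount;
};

/* Stand-in for the release path that ends in wait_for_completion(). */
static void fake_release(struct fake_dev *dev)
{
	(void)dev;
	fake_might_sleep();
}

/* Drop one reference; only the last put runs the release path. */
static void fake_put(struct fake_dev *dev)
{
	if (--dev->refcount == 0)
		fake_release(dev);
}

/* Order before the patch: retake the lock, then put. */
static int put_under_lock(void)
{
	struct fake_dev dev = { .refcount = 1 };

	violations = 0;
	fake_lock();
	fake_unlock();		/* lock dropped around blk_run_queue() */
	fake_lock();
	fake_put(&dev);		/* last put happens while atomic */
	fake_unlock();
	return violations;
}

/* Order proposed by the patch: put first, then retake the lock. */
static int put_before_lock(void)
{
	struct fake_dev dev = { .refcount = 1 };

	violations = 0;
	fake_lock();
	fake_unlock();		/* lock dropped around blk_run_queue() */
	fake_put(&dev);		/* last put happens with no lock held */
	fake_lock();
	fake_unlock();
	return violations;
}
```

In this model put_under_lock() counts one violation and put_before_lock() counts none. The reordering only helps this one caller, though; it does nothing for any other path that does a last put while atomic, which is why the kobject side still matters.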

James
