Re: block: Check that queue is alive in blk_insert_cloned_request()
From: Vivek Goyal
Date: Mon Jul 11 2011 - 21:47:48 EST
On Mon, Jul 11, 2011 at 09:22:06PM -0400, Mike Snitzer wrote:
> On Mon, Jul 11 2011 at 8:52pm -0400,
> Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>
> > On Mon, 11 Jul 2011, Mike Snitzer wrote:
> >
> > > [cc'ing dm-devel, vivek and tejun]
> > >
> > > On Fri, Jul 8, 2011 at 7:04 PM, Roland Dreier <roland@xxxxxxxxxx> wrote:
> > > > From: Roland Dreier <roland@xxxxxxxxxxxxxxx>
> > > >
> > > > This fixes crashes such as the below that I see when the storage
> > > > underlying a dm-multipath device is hot-removed. The problem is that
> > > > dm requeues a request to a device whose block queue has already been
> > > > cleaned up, and blk_insert_cloned_request() doesn't check if the queue
> > > > is alive, but rather goes ahead and tries to queue the request. This
> > > > ends up dereferencing the elevator that was already freed in
> > > > blk_cleanup_queue().
> > >
> > > Your patch looks fine to me:
> > > Acked-by: Mike Snitzer <snitzer@xxxxxxxxxx>
> >
> > There's still the issue that Stefan Richter pointed out: The test for a
> > dead queue must be made _after_ acquiring the queue lock, not _before_.
>
> Yes, quite important.
>
> Jens, can you tweak the patch or should Roland send a v2?
I do not think we should do the dead-queue check after taking the spinlock.
The reason is that there are lifetime issues with two objects:
- Validity of the request queue pointer
- Validity of the q->queue_lock pointer

If dm took a reference to the request queue at the beginning, it can be
sure the request queue pointer is valid. But the spinlock might come from
the driver and might live in one of the driver-allocated structures. So it
might happen that the driver has called blk_cleanup_queue() and freed the
structures that contained the spinlock.

So if the queue is not dead, we know that q->queue_lock is valid. I think
the only race present here is that the whole operation is not atomic: first
we check the queue-dead flag and then go on to acquire the request
queue lock, which leaves a small window for a race. I think I have
seen other code written in this manner (__generic_make_request()), so
it is probably reasonably safe to do here too.
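
To make the ordering concrete, here is a rough userspace sketch of the
check-then-lock pattern. It is not kernel code and not the actual patch:
every sketch_* name is hypothetical, and a C11 atomic_flag stands in for
the driver-owned spinlock. It only shows why the dead check has to come
before the queue's lock pointer is dereferenced.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a spinlock living in driver-allocated memory. */
struct sketch_lock {
	atomic_flag held;
};

/* Hypothetical stand-in for a request queue; not the kernel's structure. */
struct sketch_queue {
	atomic_bool dead;         /* set by cleanup before the lock goes away */
	struct sketch_lock *lock; /* may point into memory the driver frees */
	int nr_queued;            /* protected by *lock */
};

static void sketch_lock_acquire(struct sketch_lock *l)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set(&l->held))
		;
}

static void sketch_lock_release(struct sketch_lock *l)
{
	atomic_flag_clear(&l->held);
}

/* Returns 0 on success, -ENODEV if the queue has already been cleaned up. */
int sketch_insert_request(struct sketch_queue *q)
{
	assert(q != NULL); /* caller holds a reference, so q itself is valid */

	/* Check first: once the queue is dead, q->lock may dangle. */
	if (atomic_load(&q->dead))
		return -ENODEV;

	/*
	 * Small window: the queue could still die between the check above
	 * and the lock below. The sketch does not close that window.
	 */
	sketch_lock_acquire(q->lock);
	q->nr_queued++;
	sketch_lock_release(q->lock);
	return 0;
}

/* Models cleanup: mark the queue dead, then drop the driver's lock. */
void sketch_cleanup_queue(struct sketch_queue *q)
{
	sketch_lock_acquire(q->lock);
	atomic_store(&q->dead, true);
	sketch_lock_release(q->lock);
	q->lock = NULL; /* the driver may now free the containing structure */
}
```

Calling sketch_insert_request() after sketch_cleanup_queue() returns
-ENODEV without ever touching the lock pointer, which is the property the
check-before-lock ordering is meant to give.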
Thanks
Vivek