Re: [PATCH] block: Fix lock unbalance caused by lock disconnect

From: Asias He
Date: Mon May 28 2012 - 21:48:50 EST


On 05/28/2012 06:20 PM, Tejun Heo wrote:
Hello, Asias.

On Mon, May 28, 2012 at 10:15:18AM +0800, Asias He wrote:
I don't think the patch description is correct. The lock switching is
inherently broken and the patch doesn't really fix the problem
although it *might* make the problem less likely. Trying to switch
locks while there are other accessors of the lock is simply broken; it
can never work without outer synchronization.

Since the lock switching is broken, is it a good idea to force all
the drivers to use the block layer provided lock? i.e. change the
API from blk_init_queue(rfn, driver_lock) to blk_init_queue(rfn).
Any reason not to use the block layer provided one?

I think hch tried to do that a while ago. Dunno what happened to the
patches. IIRC, the whole external lock thing was about sharing a
single lock across different request_queues. Not sure whether it's
actually beneficial enough or just a crazy broken optimization.

Do we have any existing use case of sharing a single lock across different request_queues? What's the point of this sharing? Christoph?

If nobody has any objections I'd like to make the patches. Jens, any comments?

Your patch might make
the problem somewhat less likely simply because queue draining makes a
lot of request_queue users go away.

Who will use the request_queue after blk_cleanup_queue()?

Anyone who still holds a ref might try to issue a new request on a
dead queue, i.e. a blkdev with a filesystem mounted goes away and the FS
issues a new read request after blk_cleanup_queue() finishes draining.

OK. Thanks for explaining.


--
Asias