Re: [PATCH 07/10] block: reorganize throtl_get_tg() and blk_throtl_bio()

From: Tejun Heo
Date: Wed Oct 19 2011 - 13:06:30 EST


Hello,

On Wed, Oct 19, 2011 at 10:56:22AM -0400, Vivek Goyal wrote:
> A driver could call blk_cleanup_queue(), mark the queue DEAD and then
> free the driver provided spin lock. So once queue is DEAD one could
> not rely on queue lock still being there. That's the reason I did
> not try to take queue lock again if queue is marked DEAD.
>
> Now I see the change that blk_cleanup_queue will start pointing to
> internal queue lock (Though it is racy). This will at least make
> sure that some spinlock is around. So now this change should be
> fine.

The problem with the current code is that none of this is properly
synchronized. Drivers shouldn't destroy the lock or anything else until
blk_cleanup_queue() is complete, and once queue cleanup is done the
block layer shouldn't call out to the driver.

Currently, the code has various opportunistic checks which catch most
of those cases, but unfortunately I think they just make the bugs
more obscure.

That said, we probably should be switching to the internal lock once
cleanup is complete.

> > * blk_throtl_bio() indicates return status both with its return value
> > and in/out param **@bio. The former is used to indicate whether
> > queue is found to be dead during throtl processing. The latter
> > whether the bio is throttled.
> >
> > There's no point in returning DEAD check result from
> > blk_throtl_bio(). The queue can die after blk_throtl_bio() is
> > finished but before make_request_fn() grabs queue lock.
>
> The reason I was returning error in case of queue DEAD is that I
> wanted IO to now return with error instead of continuing to call
> q->make_request_fn(q, bio) which does not do queue dead check and
> assumes queue is still alive.
>
> With this change, if queue is DEAD, bio will not be throttled and we
> will continue to submit bio to queue and I am not sure who will catch
> it in __make_request()?

The same thing - all the check in blk-throtl does is narrow the race
window somewhat; without it, the window starts after the DEAD check in
generic_make_request_checks(). One way or the other, this doesn't
make a meaningful difference and I think it just obscures the bug
both in behavior and code (it's being checked here, it's gotta be
safe!). So, I just wanted to remove it before fixing it properly.
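
To illustrate why an early DEAD check only narrows the window, here is
a minimal userspace sketch. It is not kernel code: struct model_queue,
model_cleanup(), model_make_request() and model_submit() are all
hypothetical names, and the "between" hook is just a stand-in for
another CPU running cleanup in the gap between the check and the call.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's struct request_queue. */
struct model_queue {
	bool dead;	/* stands in for the DEAD queue flag */
	int submitted;	/* requests that reached a live queue */
	int hit_dead;	/* requests that reached the queue after DEAD */
};

/* Stands in for blk_cleanup_queue() marking the queue DEAD. */
static void model_cleanup(struct model_queue *q)
{
	q->dead = true;
}

/* Stands in for q->make_request_fn(), which assumes a live queue. */
static void model_make_request(struct model_queue *q)
{
	if (q->dead)
		q->hit_dead++;
	else
		q->submitted++;
}

/*
 * Early DEAD check followed by the call.  'between' runs after the
 * check and before the call, modelling whatever happens in that gap.
 * Returns -1 if the early check saw DEAD, 0 otherwise.
 */
static int model_submit(struct model_queue *q,
			void (*between)(struct model_queue *))
{
	if (q->dead)
		return -1;
	if (between)
		between(q);
	model_make_request(q);
	return 0;
}
```

Calling model_submit(&q, model_cleanup) on a live queue passes the
early check and still bumps hit_dead, which is the case the check
appears to rule out but doesn't.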

Thank you.

--
tejun