Re: [PATCH BUGFIX] block, bfq: access and cache blkg data only when safe

From: Tejun Heo
Date: Wed May 24 2017 - 12:47:13 EST


Hello,

On Wed, May 24, 2017 at 05:43:18PM +0100, Paolo Valente wrote:
> > so none of the above objects can be destroyed before the request is
> > done.
>
> ... the issue seems just to move to a more subtle position: cfq is ok,
> because it protects itself with rq lock, but blk-mq schedulers don't.
> So, the race that leads to the (real) crashes reported by people may
> actually be:

Oh, I was just thinking about !mq paths the whole time.

> 1 blkg_lookup executed on a blkg being destroyed: the scheduler gets a
> copy of the content of the blkg, but the rcu mechanism doesn't prevent
> destruction from going on
> 2 blkg_get gets executed on the copy of the original blkg

So, we can't do that. We should look up the blkg, bump its ref, and
then use the original object rather than a copy of it. We should
probably also switch blkgs to use percpu-refs.
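
To illustrate the look-up-and-bump idea, here is a rough userspace sketch
using C11 atomics. It is not the blkcg code: struct obj, obj_tryget(),
obj_put() and obj_lookup_get() are hypothetical names made up for this
example, and the kernel's own primitives (RCU, percpu-refs) are not used.
The sketch assumes that a count of zero means destruction has already
started. The caller takes its reference on the original object, and only
if the count has not yet dropped to zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object such as a blkg. */
struct obj {
	atomic_int refcnt;	/* assumption: 0 means destruction has started */
	int data;
};

/* Take a reference only while the count is still non-zero. */
static bool obj_tryget(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old > 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;	/* object is already being torn down */
}

/* Drop a reference previously taken with obj_tryget(). */
static void obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcnt, 1);

	assert(old > 0);	/* catch unbalanced puts */
}

/*
 * Return the original object with a reference held, or NULL if it
 * could not be pinned. The caller keeps using this pointer, never a
 * private copy of the object's contents.
 */
static struct obj *obj_lookup_get(struct obj *candidate)
{
	if (candidate && obj_tryget(candidate))
		return candidate;
	return NULL;
}
```

A caller would treat a NULL return as "object is going away" and fall back
to whatever default it has, and would call obj_put() once it is done with a
successfully pinned object.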

Thanks.

--
tejun