Re: [PATCH BUGFIX] block, bfq: access and cache blkg data only when safe

From: Tejun Heo
Date: Wed May 24 2017 - 10:50:33 EST


Hello, Paolo.

On Wed, May 24, 2017 at 12:53:26PM +0100, Paolo Valente wrote:
> Exact, but even after all blkgs, as well as the cfq_group and pd, are
> gone, the children cfq_queues of the gone cfq_group continue to point
> to unexisting objects, until new cfq_set_requests are executed for
> those cfq_queues. To try to make this statement clearer, here is the
> critical sequence for a cfq_queue, say cfqq, belonging to a cfq_group,
> say cfqg:
>
> 1 cfq_set_request for a request rq of cfqq
> 2 removal of (the process associated with cfqq) from bfqg
> 3 destruction of the blkg that bfqg is associated with
> 4 destruction of the blkcg the above blkg belongs to
> 5 destruction of the pd pointed to by cfqg, and of cfqg itself
> !!!-> from now on cfqq->cfqg is a dangling reference <-!!!
> 6 execution of cfq functions, different from cfq_set_request, on cfqq
> . cfq_insert, cfq_dispatch, cfq_completed_rq, ...
> 7 execution of a new cfq_set_request for cfqq
> -> now cfqq->cfqg is again a sane pointer <-
>
> Every function executed at step 6 sees a dangling reference for
> cfqq->cfqg.
>
> My fix for caching data doesn't solve this more serious problem.
>
> Where have I been mistaken?

Hmmm... cfq_set_request() invokes cfqg_get() which increases refcnt on
the blkg, which should pin everything down till the request is done,
so none of the above objects can be destroyed before the request is
done.

Thanks.

--
tejun