Re: docker crashes rcuos in __blkg_release_rcu

From: Tejun Heo
Date: Thu Jun 19 2014 - 16:26:49 EST


Sorry about the late reply.

On Wed, Jun 11, 2014 at 12:32:29PM -0400, Vivek Goyal wrote:
> Tejun, any thoughts on how to solve this issue. Delaying blkg release
> in rcu context and then expecting queue to be still present is causing
> this problem.

Heh, this is hilarious. If you look at the comment right above
__blkg_release_rcu(), it says

 * A group is RCU protected, but having an rcu lock does not mean that one
 * can access all the fields of blkg and assume these are valid. For
 * example, don't try to follow throtl_data and request queue links.
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

And yet the code brazenly derefs the ->q link to access the lock there
and causes an oops. This is from 2a4fd070ee85 ("blkcg: move bulk of
blkcg_gq release operations to the RCU callback"). I stupidly didn't
realize what I was doing even while moving the comment itself.

Well, the obvious solution is making blkg ref an atomic. I was
planning to convert it to percpu_ref anyway. We can first convert it
to atomic_t for -stable and then to percpu_ref. Will prep a patch.
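
To illustrate the idea, here is a rough userspace sketch, not the actual
patch. It assumes C11 atomics as a stand-in for the kernel's atomic_t, and
struct group, group_init(), group_get() and group_put() are hypothetical
names made up for this example. The point is only that once the count is
atomic, get/put no longer need the queue lock, so the release path never
has to follow the ->q link:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a blkg; this is not the kernel structure. */
struct group {
	atomic_int refcnt;	/* atomic, so get/put need no external lock */
	void *q;		/* stand-in for the ->q link; may be stale at release */
	bool released;		/* set once the last reference is dropped */
};

static void group_init(struct group *g)
{
	atomic_init(&g->refcnt, 1);	/* creator holds the first reference */
	g->q = NULL;
	g->released = false;
}

static void group_get(struct group *g)
{
	int old = atomic_fetch_add(&g->refcnt, 1);

	assert(old > 0);	/* caller must already hold a reference */
}

/* Returns true if this call dropped the last reference. */
static bool group_put(struct group *g)
{
	/* fetch_sub returns the previous value; 1 means we were the last */
	if (atomic_fetch_sub(&g->refcnt, 1) == 1) {
		/* release path: deliberately never dereferences g->q */
		g->released = true;
		return true;
	}
	return false;
}
```

In the sketch the decrement and the "was I last?" check happen in one
atomic operation, which is what lets the release path run without taking
a lock that lives in another object.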

Thanks for tracking it down!

--
tejun