Re: [PATCH 2/2] cgroup: Use separate work structs on css release path

From: Tejun Heo
Date: Wed Jun 01 2022 - 20:07:23 EST


Hello,

On Wed, Jun 01, 2022 at 05:00:44PM -0700, Tadeusz Struk wrote:
> What I'm trying to say is that it's not really a ref imbalance problem.
> I think once the kill_css() has been called on a css, and it is enqueued to be
> killed, we shouldn't call css_put(css) on it anymore outside of the "killed call
> flow path". It will all be handled by the css_killed_ref_fn() function.
>
> The fact the css_release() is called (via cgroup_kn_unlock()) just after
> kill_css() causes the css->destroy_work to be enqueued twice on the same WQ
> (cgroup_destroy_wq), just with different function. This results in the
> BUG: corrupted list in insert_work issue.

I have a hard time following here. The kill / release paths relationship
isn't that complicated. The kill path is invoked when the percpu_ref's base
ref is killed and holds an extra ref so that it's guaranteed that release
can't happen before the kill path is done with the css. When the final put
happens - whether that's from the kill path or someone else, which often is
the case - the release path triggers. If we have release getting scheduled
while the kill path isn't finished, it is a reference counting problem,
right?

Can you elaborate the exact scenario that you think is happening? Please
feel free to omit the function calls and all that. Just lay out who's doing
what.

Thanks.

--
tejun