Re: [PATCH 2/2] cgroup: Use separate work structs on css release path

From: Tadeusz Struk
Date: Wed Jun 01 2022 - 20:40:57 EST


On 6/1/22 17:29, Tejun Heo wrote:
> On Wed, Jun 01, 2022 at 05:26:34PM -0700, Tadeusz Struk wrote:
>> Ok the problem is that
>>
>> 1. kill_css() triggers css_killed_ref_fn(), which enqueues &css->destroy_work on cgroup_destroy_wq
>> 2. Last css_put() calls css_release(), which enqueues &css->destroy_work on cgroup_destroy_wq
>>
>> We have two instances of the same work struct enqueued on the same WQ (cgroup_destroy_wq),
>> which causes "BUG: corrupted list in insert_work"

> #2 shouldn't be happening before kill_ref_fn() is done with the css. If what
> you're saying is happening, what's broken is the fact that the refcnt is
> reaching 0 prematurely.

css_killed_ref_fn() will be called regardless of the value of refcnt
(via percpu_ref_kill_and_confirm()), and it only enqueues
css_killed_work_fn() to be called later.
Then css_put()->css_release() is called before css_killed_work_fn() even
gets a chance to run, and it also *only* enqueues css_release_work_fn()
to be called later.
The problem happens on the second enqueue. So there needs to be something
in place that makes sure css_killed_work_fn() is done before css_release()
can enqueue the second job. Does that sound right?
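
To illustrate the second enqueue, here is a minimal userspace sketch, not kernel code. It assumes, as I read kernel/cgroup/cgroup.c, that both css_killed_ref_fn() and css_release() re-initialise css->destroy_work with INIT_WORK() before queueing it, which resets the pending bit and the list node. All names below (list_node, work_item, init_work, queue_work_item, list_is_consistent, shared_work_corrupts_queue) are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct list_node {
    struct list_node *next, *prev;
};

/* Hypothetical stand-in for struct work_struct. */
struct work_item {
    struct list_node entry;
    bool pending;
    void (*fn)(struct work_item *);
};

static void list_init(struct list_node *n)
{
    n->next = n;
    n->prev = n;
}

/* Same store order as a plain doubly linked tail insert. */
static void list_add_tail_node(struct list_node *new, struct list_node *head)
{
    struct list_node *prev = head->prev;

    head->prev = new;
    new->next = head;
    new->prev = prev;
    prev->next = new;
}

/* Like INIT_WORK(): clears the pending state and resets the list node. */
static void init_work(struct work_item *w, void (*fn)(struct work_item *))
{
    list_init(&w->entry);
    w->pending = false;
    w->fn = fn;
}

/* Like queue_work(): refuses an item that is still marked pending. */
static bool queue_work_item(struct list_node *wq, struct work_item *w)
{
    if (w->pending)
        return false;
    w->pending = true;
    list_add_tail_node(&w->entry, wq);
    return true;
}

/* Bounded walk checking node->next->prev == node for every node. */
static bool list_is_consistent(const struct list_node *head, int max_nodes)
{
    const struct list_node *n = head;

    for (int i = 0; i <= max_nodes; i++) {
        if (n->next->prev != n)
            return false;
        n = n->next;
        if (n == head)
            return true;
    }
    return false; /* never got back to head */
}

static void killed_work_fn(struct work_item *w) { (void)w; }
static void release_work_fn(struct work_item *w) { (void)w; }

/* Kill path queues the item, then the release path re-inits and queues it
 * again before any worker has dequeued it. Returns true if the queue's
 * list ends up inconsistent. */
static bool shared_work_corrupts_queue(void)
{
    struct list_node wq;
    struct work_item destroy_work;

    list_init(&wq);
    init_work(&destroy_work, killed_work_fn);
    queue_work_item(&wq, &destroy_work);

    init_work(&destroy_work, release_work_fn); /* node is still linked */
    queue_work_item(&wq, &destroy_work);

    return !list_is_consistent(&wq, 4);
}
```

In this model the pending check alone would reject a second queue_work_item() on the same item; it is the re-init of a still-linked node that lets the second insert through and breaks the list.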

>> So I think the easiest way to solve this would be to have two separate work_structs,
>> one for the killed_ref path and css_release path as in:

> If you do that, you'd just be racing the free path against the kill path and
> the css might get freed while the kill path is still accessing it.
>
> Thanks.



--
Thanks,
Tadeusz