Re: [PATCH] cgroup: serialize css kill and release paths

From: Tadeusz Struk
Date: Fri Jun 03 2022 - 14:22:10 EST


On 6/3/22 10:34, Tadeusz Struk wrote:
Syzbot found a corrupted list bug scenario that can be triggered from
cgroup_subtree_control_write(cgrp). The reproducer writes to the
cgroup.subtree_control file, which invokes:
cgroup_apply_control_enable()->css_create()->css_populate_dir(), which
then fails with a fault injected -ENOMEM.
In such a scenario css_killed_work_fn() will be enqueued via
cgroup_apply_control_disable(cgrp)->kill_css(css), and the write will
bail out to cgroup_kn_unlock(). Then cgroup_kn_unlock() will call:
cgroup_put(cgrp)->css_put(&cgrp->self), which will try to enqueue
css_release_work_fn for the same css instance, causing a list_add
corruption bug, as can be seen in the syzkaller report [1].
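
To illustrate why a second enqueue is harmful, here is a minimal
userspace sketch. It assumes that both work functions are queued through
one shared work item whose list entry is re-initialised before each
enqueue; that detail is an assumption and is not stated in this message.
The list_node type and the helpers below are hypothetical and only mimic
the spirit of the kernel's list debug checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel-style doubly linked list entry. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/*
 * Add 'new' before 'head'. Returns false instead of linking when the
 * neighbouring pointers are inconsistent, i.e. when the insertion would
 * corrupt the list.
 */
static bool list_add_tail_checked(struct list_node *new,
				  struct list_node *head)
{
	struct list_node *prev = head->prev;

	if (prev->next != head || new == prev || new == head)
		return false;

	new->next = head;
	new->prev = prev;
	prev->next = new;
	head->prev = new;
	return true;
}

/* Returns true when the second enqueue of the same item is rejected. */
static bool queue_same_item_twice(void)
{
	struct list_node queue, work;
	bool first, second;

	list_init(&queue);
	list_init(&work);

	/* Kill path queues the item; this one is fine. */
	first = list_add_tail_checked(&work, &queue);
	assert(first);

	/* Release path re-initialises the still-queued item and queues it again. */
	list_init(&work);
	second = list_add_tail_checked(&work, &queue);

	return !second;
}
```

After the re-initialisation the queue still points at the item, but the
item no longer points back at the queue, so the second insertion sees an
inconsistent neighbour and is refused.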

Fix this by synchronizing the css ref_kill and css_release jobs.
The css_release() function will check whether css_killed_work_fn() has
been scheduled for the css and only enqueue css_release_work_fn() if
css_killed_work_fn() wasn't already enqueued. Otherwise css_release()
will set the CSS_REL_LATER flag for that css. This will cause
css_release_work_fn() to be executed after css_killed_work_fn() has
finished.

Two css flags have been introduced to implement this serialization mechanism:

* CSS_KILL_ENQED, which will be set when css_killed_work_fn() is enqueued, and
* CSS_REL_LATER, which, if set, will cause css_release_work_fn() to be
scheduled after css_killed_work_fn() has finished.

There is also a new lock, which will protect the integrity of the css flags.
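
The flag handling described above can be modelled in userspace roughly
as follows. This is only a sketch of the idea, not the patch itself: the
css_model struct, the model_* helpers and the flag values are
hypothetical, and the new lock is omitted because the model is
single-threaded (comments mark where it would be taken).

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names come from the patch description; the values are made up. */
enum {
	CSS_KILL_ENQED = 1u << 0,
	CSS_REL_LATER  = 1u << 1,
};

/* Hypothetical model of the per-css state involved. */
struct css_model {
	unsigned int flags;
	bool release_queued;	/* release work currently queued */
	int release_count;	/* how many times it was queued */
};

static void enqueue_release(struct css_model *css)
{
	/* Queueing the release work twice is the bug being avoided. */
	assert(!css->release_queued);
	css->release_queued = true;
	css->release_count++;
}

/* Kill path: record that the kill work has been enqueued. */
static void model_kill_css(struct css_model *css)
{
	/* the new lock would be held here */
	css->flags |= CSS_KILL_ENQED;
}

/* Release path: enqueue now, or defer if the kill work is pending. */
static void model_css_release(struct css_model *css)
{
	/* the new lock would be held here */
	if (css->flags & CSS_KILL_ENQED)
		css->flags |= CSS_REL_LATER;
	else
		enqueue_release(css);
}

/* End of the kill work: run a deferred release, if one was requested. */
static void model_css_killed_work_fn(struct css_model *css)
{
	/* the new lock would be held here */
	css->flags &= ~CSS_KILL_ENQED;
	if (css->flags & CSS_REL_LATER) {
		css->flags &= ~CSS_REL_LATER;
		enqueue_release(css);
	}
}

/* Replays the failure path: kill, then release, then the kill work runs. */
static int model_failure_path(void)
{
	struct css_model css = { 0 };

	model_kill_css(&css);
	model_css_release(&css);	/* deferred, nothing queued yet */
	model_css_killed_work_fn(&css);	/* queues the release once */
	return css.release_count;
}
```

In this model the release work is queued exactly once, and only after
the kill work has finished, which is the ordering the two flags are
meant to enforce.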

[1] https://syzkaller.appspot.com/bug?id=e26e54d6eac9d9fb50b221ec3e4627b327465dbd

Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Michal Koutny <mkoutny@xxxxxxxx>
Cc: Zefan Li <lizefan.x@xxxxxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Christian Brauner <brauner@xxxxxxxxxx>
Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
Cc: Andrii Nakryiko <andrii@xxxxxxxxxx>
Cc: Martin KaFai Lau <kafai@xxxxxx>
Cc: Song Liu <songliubraving@xxxxxx>
Cc: Yonghong Song <yhs@xxxxxx>
Cc: John Fastabend <john.fastabend@xxxxxxxxx>
Cc: KP Singh <kpsingh@xxxxxxxxxx>
Cc: <cgroups@xxxxxxxxxxxxxxx>
Cc: <netdev@xxxxxxxxxxxxxxx>
Cc: <bpf@xxxxxxxxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
Cc: <linux-kernel@xxxxxxxxxxxxxxx>

Reported-and-tested-by: syzbot+e42ae441c3b10acf9e9d@xxxxxxxxxxxxxxxxxxxxxxxxx
Fixes: 8f36aaec9c92 ("cgroup: Use rcu_work instead of explicit rcu and work item")
Signed-off-by: Tadeusz Struk <tadeusz.struk@xxxxxxxxxx>

I just spotted an issue with this: I'm holding an invalid lock in
css_killed_work_fn(). I will follow up with a v2 of the patch soon.

--
Thanks,
Tadeusz