[PATCH v3 3/4] blk-cgroup: skip dying blkg in blkcg_activate_policy()

From: Yu Kuai

Date: Wed Jun 24 2026 - 22:58:20 EST


From: Zheng Qixing <zhengqixing@xxxxxxxxxx>

When switching IO schedulers on a block device, blkcg_activate_policy()
can race with concurrent blkcg deletion, leading to a use-after-free in
rcu_accelerate_cbs.

T1: T2:
blkg_destroy
kill(&blkg->refcnt) // blkg->refcnt=1->0
blkg_release // call_rcu(__blkg_release)
...
blkg_free_workfn
->pd_free_fn(pd)
elv_iosched_store
elevator_switch
...
iterate blkg list
blkg_get(blkg) // blkg->refcnt=0->1
list_del_init(&blkg->q_node)
blkg_put(pinned_blkg) // blkg->refcnt=1->0
blkg_release // call_rcu again
rcu_accelerate_cbs // uaf

Fix this by checking hlist_unhashed(&blkg->blkcg_node) before getting
a reference to the blkg. This is the same check used in blkg_destroy()
to detect if a blkg has already been destroyed. If the blkg is already
unhashed, skip processing it since it's being destroyed.

Fixes: f1c006f1c685 ("blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
Signed-off-by: Zheng Qixing <zhengqixing@xxxxxxxxxx>
Reviewed-by: Tang Yizhou <yizhou.tang@xxxxxxxxxx>
Signed-off-by: Yu Kuai <yukuai@xxxxxxx>
---
block/blk-cgroup.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index fd1eed67924b..d06915045bc4 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1575,10 +1575,12 @@ int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy *pol)
list_for_each_entry_reverse(blkg, &q->blkg_list, q_node) {
struct blkg_policy_data *pd;

if (blkg->pd[pol->plid])
continue;
+ if (hlist_unhashed(&blkg->blkcg_node))
+ continue;

/* If prealloc matches, use it; otherwise try GFP_NOWAIT */
if (blkg == pinned_blkg) {
pd = pd_prealloc;
pd_prealloc = NULL;
--
2.51.0