Re: [PATCH 3/3] blk-cgroup: skip dying blkg in blkcg_activate_policy()

From: Zheng Qixing

Date: Fri Jan 09 2026 - 03:47:12 EST



On 2026/1/9 0:22, Yu Kuai wrote:
Hi,

On 2026/1/8 9:44, Zheng Qixing wrote:
From: Zheng Qixing <zhengqixing@xxxxxxxxxx>

When switching IO schedulers on a block device, blkcg_activate_policy()
can race with concurrent blkcg deletion, leading to a use-after-free of
the blkg.

T1:                                          T2:
elv_iosched_store                            blkg_destroy
elevator_switch                              kill(&blkg->refcnt) // blkg->refcnt=0
...                                          blkg_release // call_rcu
blkcg_activate_policy                        __blkg_release
list for blkg                                blkg_free
                                             blkg_free_workfn
                                             ->pd_free_fn(pd)
blkg_get(blkg) // blkg->refcnt=0->1
                                             list_del_init(&blkg->q_node)
                                             kfree(blkg)
blkg_put(pinned_blkg) // blkg->refcnt=1->0
blkg_release // call_rcu again
call_rcu(..., __blkg_release)

This stack is not clear to me. Can this problem be fixed by protecting
q->blkg_list iteration with blkcg_mutex, as I said in patch 2?

It appears that adding blkcg_mutex still cannot resolve the issue where the same blkg has its
reference count decremented to 0 twice.
However, it does fix the memory leak caused by pd_alloc_fn() succeeding for a blkg that has
already been removed.
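
To make the double drop to zero concrete, here is a minimal userspace sketch.
It assumes a plain C11 atomic counter in place of the kernel's percpu_ref, and
every name in it (struct obj, obj_get, obj_tryget, obj_put, replay_race) is
hypothetical and does not exist in blk-cgroup.c. It replays the T1/T2
interleaving in a single thread and counts how often the release path is
scheduled. The tryget variant only illustrates one way of skipping a dying
object; it is not necessarily the check this patch adds.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a blkg; not the kernel's struct blkcg_gq. */
struct obj {
	atomic_int refcnt;	/* models blkg->refcnt as a plain counter */
	int release_calls;	/* times the release path was scheduled */
};

/* Unconditional get: succeeds even after the count has reached zero. */
static void obj_get(struct obj *o)
{
	atomic_fetch_add(&o->refcnt, 1);
}

/* Conditional get: takes a reference only while the count is nonzero. */
static bool obj_tryget(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Put: the 1->0 transition schedules the release (call_rcu in the trace). */
static void obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcnt, 1) == 1)
		o->release_calls++;
}

/* Replays the interleaving and returns how many releases were scheduled. */
int replay_race(bool use_tryget)
{
	struct obj o = { .release_calls = 0 };

	atomic_init(&o.refcnt, 1);

	obj_put(&o);			/* T2: refcnt 1->0, release #1 */

	if (use_tryget) {
		if (obj_tryget(&o))	/* T1: fails, object is dying */
			obj_put(&o);
	} else {
		obj_get(&o);		/* T1: refcnt 0->1 */
		obj_put(&o);		/* T1: refcnt 1->0, release #2 */
	}
	return o.release_calls;
}
```

In this model replay_race(false) schedules the release twice, which matches
the second call_rcu in the trace, while replay_race(true) schedules it once.
Holding a mutex around the list walk does not change the first result, since
the count has already reached zero before T1 takes its reference.
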
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index af468676cad1..ac7702db0836 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1645,9 +1645,10 @@ int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy *pol)
* GFP_NOWAIT failed. Free the existing one and
* prealloc for @blkg w/ GFP_KERNEL.
*/
Why is this check not done before pd_alloc_fn()? What if pd_alloc_fn() succeeds
for a removed blkg?

I will fix this memory leak issue in the next revision.


Thanks,

Qixing