Re: [PATCH 3/3] blk-cgroup: skip dying blkg in blkcg_activate_policy()


From: Zheng Qixing

Date: Fri Jan 09 2026 - 03:47:12 EST

On 2026/1/9 0:22, Yu Kuai wrote:

Hi,

On 2026/1/8 9:44, Zheng Qixing wrote:

From: Zheng Qixing <zhengqixing@xxxxxxxxxx>

When switching IO schedulers on a block device, blkcg_activate_policy()
can race with concurrent blkcg deletion, leading to a use-after-free of
the blkg.

T1:                                     T2:
elv_iosched_store                       blkg_destroy
elevator_switch                         kill(&blkg->refcnt) // blkg->refcnt=0
...                                     blkg_release // call_rcu
blkcg_activate_policy                   __blkg_release
list for blkg                           blkg_free
                                        blkg_free_workfn
                                        ->pd_free_fn(pd)
blkg_get(blkg) // blkg->refcnt=0->1
                                        list_del_init(&blkg->q_node)
                                        kfree(blkg)
blkg_put(pinned_blkg) // blkg->refcnt=1->0
blkg_release // call_rcu again
call_rcu(..., __blkg_release)

This stack is not clear to me. Can this problem be fixed by protecting the
q->blkg_list iteration with blkcg_mutex, as I said in patch 2?

It appears that adding blkcg_mutex still cannot resolve the issue where the same blkg has its
reference count decremented to 0 twice.
However, it does fix the memory leak caused by pd_alloc_fn() succeeding for a blkg that has
already been removed.
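
To make the "decremented to 0 twice" case concrete, here is a minimal
user-space sketch. It is not kernel code: struct obj, obj_get(), obj_tryget(),
obj_put() and replay_race() are hypothetical stand-ins for the blkg and its
refcount helpers, and the T1/T2 interleaving is replayed sequentially rather
than with real threads. It only illustrates why an unconditional get on a
count that has already reached zero leads to a second release, and how a
conditional get avoids that.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a refcounted blkg; not kernel code. */
struct obj {
	atomic_int refcnt;	/* 0 means the release path has already started */
	int releases;		/* how many times the release path was entered */
};

/* Unconditional get: succeeds even when the count already hit zero. */
static void obj_get(struct obj *o)
{
	atomic_fetch_add(&o->refcnt, 1);
}

/* Conditional get: refuses to take a reference once the count is zero. */
static bool obj_tryget(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Put: the caller that drops the count to zero enters the release path. */
static void obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcnt, 1);

	assert(old > 0);	/* a put without a matching reference is a bug */
	if (old == 1)
		o->releases++;
}

/* Replays the T1/T2 interleaving sequentially; returns the release count. */
int replay_race(bool use_tryget)
{
	struct obj o = { .refcnt = 1, .releases = 0 };

	obj_put(&o);			/* T2: base reference dropped, 1 -> 0 */
	if (use_tryget) {
		if (obj_tryget(&o))	/* T1: fails, the dying object is skipped */
			obj_put(&o);
	} else {
		obj_get(&o);		/* T1: 0 -> 1 on a dying object */
		obj_put(&o);		/* T1: 1 -> 0, release entered again */
	}
	return o.releases;
}
```

With the unconditional get, replay_race(false) enters the release path twice;
with the conditional get, replay_race(true) enters it once. Whether the real
fix should use a tryget-style helper or an explicit "is this blkg dying" check
is a separate question that this sketch does not settle.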

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index af468676cad1..ac7702db0836 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1645,9 +1645,10 @@ int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy *pol)
* GFP_NOWAIT failed. Free the existing one and
* prealloc for @blkg w/ GFP_KERNEL.
*/

Why is this check not done before pd_alloc_fn()? What if pd_alloc_fn() succeeds for
a removed blkg?

I will fix this memory leak issue in the next revision.
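
As a rough illustration of the ordering being asked for, here is a small
user-space sketch. It assumes the leak arises because teardown only visits
entries that are still linked, so data allocated for an unlinked entry is never
freed. struct item, item_attach_pd() and items_teardown() are hypothetical
names, malloc()/free() stand in for pd_alloc_fn()/pd_free_fn(), and no locking
is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a blkg; on_list models membership in q->blkg_list. */
struct item {
	bool on_list;
	void *pd;
};

/* Allocates policy data only for items that are still linked. */
bool item_attach_pd(struct item *it, size_t size)
{
	assert(it != NULL);
	if (!it->on_list)	/* check first: nothing is allocated for a removed item */
		return false;
	it->pd = malloc(size);
	return it->pd != NULL;
}

/* Teardown walks linked items only, so pd on an unlinked item would be missed. */
void items_teardown(struct item *items, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!items[i].on_list)
			continue;
		free(items[i].pd);
		items[i].pd = NULL;
	}
}
```

Doing the check before the allocation means a removed item never owns data
that teardown cannot reach.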

Thanks,

Qixing