Re: [PATCH v2 1/3] blk-cgroup: fix race between policy activation and blkg destruction
From: Michal Koutný
Date: Thu Jan 15 2026 - 04:39:12 EST
On Thu, Jan 15, 2026 at 11:27:47AM +0800, Zheng Qixing <zhengqixing@xxxxxxxxxxxxxxx> wrote:
> Yes, this issue was discovered by injecting memory allocation failure at
> ->pd_alloc_fn(..., GFP_KERNEL) in blkcg_activate_policy().
Fair enough.
> Commit f1c006f1c685 ("blk-cgroup: synchronize pd_free_fn() from
> blkg_free_workfn() and blkcg_deactivate_policy()") delays
> list_del_init(&blkg->q_node) until after pd_free_fn() in blkg_free_workfn().
IIUC, the point was to delay it from blkg_destroy() until blkg_free_workfn(),
but inside blkg_free_workfn() it may have gone too far, since it now calls
the pd_free_fn()s before the actual list removal.
(I'm Cc'ing Kuai's correct address now.)
IOW, I'm wondering whether merely swapping these two actions (pd_free_fn()
and list removal) would be a sufficient fix for the discovered issue
(instead of expanding lock coverage).
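
To make the ordering concrete, here is a minimal userspace sketch of the
"unlink first, free policy data afterwards" idea. It is only a toy model, not
kernel code: every model_* name is hypothetical, the pthread mutex merely
stands in for whatever lock protects the blkg list, and a single int stands in
for the per-policy data. Whether this ordering alone is enough in the real
blkg_free_workfn() is exactly the open question above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a blkg: list linkage plus one policy-data slot. */
struct model_blkg {
	struct model_blkg *next;	/* models blkg->q_node */
	int *pd;			/* models the per-policy data */
};

/* Hypothetical stand-in for the request queue and its list lock. */
struct model_queue {
	pthread_mutex_t lock;
	struct model_blkg *head;
};

static void model_queue_init(struct model_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->head = NULL;
}

/* Allocate a blkg with policy data and publish it on the list. */
static struct model_blkg *model_blkg_add(struct model_queue *q)
{
	struct model_blkg *blkg = calloc(1, sizeof(*blkg));

	if (!blkg)
		return NULL;
	blkg->pd = calloc(1, sizeof(*blkg->pd));
	if (!blkg->pd) {
		free(blkg);
		return NULL;
	}
	pthread_mutex_lock(&q->lock);
	blkg->next = q->head;
	q->head = blkg;
	pthread_mutex_unlock(&q->lock);
	return blkg;
}

/*
 * Swapped ordering: first make the blkg unreachable from the list (under
 * the list lock), and only then free its policy data, outside the lock.
 */
static void model_blkg_free(struct model_queue *q, struct model_blkg *blkg)
{
	struct model_blkg **pp;

	pthread_mutex_lock(&q->lock);
	for (pp = &q->head; *pp; pp = &(*pp)->next) {
		if (*pp == blkg) {
			*pp = blkg->next;
			break;
		}
	}
	pthread_mutex_unlock(&q->lock);

	free(blkg->pd);
	free(blkg);
}

/*
 * Models a list walker such as policy activation: counts the listed blkgs
 * and checks that each one it can reach still has its policy data pointer.
 */
static int model_count_listed(struct model_queue *q)
{
	struct model_blkg *blkg;
	int n = 0;

	pthread_mutex_lock(&q->lock);
	for (blkg = q->head; blkg; blkg = blkg->next) {
		assert(blkg->pd != NULL);
		n++;
	}
	pthread_mutex_unlock(&q->lock);
	return n;
}
```

In this model a walker holding the list lock can never reach a blkg whose
policy data has already been freed, because the unlink happens before the
free. It deliberately ignores everything else the kernel paths do (refcounts,
RCU, the other mutexes involved).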
Thanks,
Michal