Re: [patch V2 3/4] sched/mmcid: Drop per CPU CID immediately when switching to per task mode

From: Thomas Gleixner

Date: Tue Feb 10 2026 - 08:34:20 EST


On Tue, Feb 10 2026 at 11:51, Shinichiro Kawasaki wrote:
> On Feb 10, 2026 / 11:44, Thomas Gleixner wrote:
>> > [ 65.768341] [ T1296] BUG: KASAN: slab-use-after-free in sched_mm_cid_exit+0x298/0x500
>>
>> Can you please decode these symbols (file/line) so that we actually see
>> which access is flagged by KASAN?
>
> Sure, faddr2line points to the line the patch touched:
>
> $ ./scripts/faddr2line vmlinux sched_mm_cid_exit+0x298/0x500
> sched_mm_cid_exit+0x298/0x500:
> arch_clear_bit at arch/x86/include/asm/bitops.h:79
> (inlined by) clear_bit at include/asm-generic/bitops/instrumented-atomic.h:42
> (inlined by) mm_drop_cid at kernel/sched/sched.h:3746
> (inlined by) mm_drop_cid_on_cpu at kernel/sched/sched.h:3762
> (inlined by) sched_mm_cid_exit at kernel/sched/core.c:10737

Ok. That's useful and I think I know what's going on.

fork() switches to per CPU mode and sets the TRANSIT bit on the task and
the CPU.

While the task is out in user space and therefore not scheduling, other
tasks exit. When this task finally exits, it hits the mode change back
to per task mode.

It still has the TRANSIT bit set in both task::mm_cid::cid and in the
per CPU CID storage. sched_mm_cid_remove_user() clears the TRANSIT bit in
the task and drops the CID, but it does not touch the per CPU storage.

That's functionally correct because a CID is only owned by the CPU when
the ONCPU bit is set, which is mutually exclusive with the TRANSIT flag.

Now mm_drop_cid_on_cpu() wrongly assumes that the CID is CPU owned
because the prior mode was per CPU. So it clears the (not set) ONCPU bit
and then invokes clear_bit() with an insanely large bit number because
TRANSIT (bit 29) is still set in the stored value. Duh.
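
To make the arithmetic concrete, here is a minimal user-space sketch of the
failure, not the kernel code. Only "TRANSIT is bit 29" comes from the analysis
above; the ONCPU bit position (30) and all sketch_* names are assumptions made
up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed flag layout. Only TRANSIT == bit 29 is taken from the mail. */
#define SKETCH_CID_TRANSIT	(1U << 29)
#define SKETCH_CID_ONCPU	(1U << 30)	/* assumption */

/* Hypothetical stand-ins for cid_on_cpu() and cpu_cid_to_cid() */
static bool sketch_cid_on_cpu(unsigned int cid)
{
	return (cid & SKETCH_CID_ONCPU) != 0;
}

static unsigned int sketch_cpu_cid_to_cid(unsigned int cid)
{
	return cid & ~SKETCH_CID_ONCPU;
}

/*
 * Old behaviour: strip ONCPU unconditionally and use the result as the
 * bit number to clear in the CID bitmap. A stale TRANSIT flag survives
 * and ends up in the bit number.
 */
static unsigned int sketch_old_drop_bit(unsigned int pcp_cid)
{
	return sketch_cpu_cid_to_cid(pcp_cid);
}

/*
 * Fixed behaviour: only drop when the CPU really owns the CID.
 * Returns true and stores the bit number in *bit if a drop happens.
 */
static bool sketch_new_drop_bit(unsigned int pcp_cid, unsigned int *bit)
{
	if (!sketch_cid_on_cpu(pcp_cid))
		return false;
	*bit = sketch_cpu_cid_to_cid(pcp_cid);
	return true;
}
```

With a stale per CPU value of 3 | SKETCH_CID_TRANSIT, sketch_old_drop_bit()
yields 536870915 as the bit number, far outside any CID bitmap, while
sketch_new_drop_bit() returns false and leaves the bitmap alone.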

Can you please try the fix below?

Thanks

tglx
---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 854984967fe2..61c2d65156b5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10729,10 +10729,9 @@ void sched_mm_cid_exit(struct task_struct *t)
return;
/*
* Mode change. The task has the CID unset
- * already. The CPU CID is still valid and
- * does not have MM_CID_TRANSIT set as the
- * mode change has just taken effect under
- * mm::mm_cid::lock. Drop it.
+ * already and dealt with an eventually set
+ * TRANSIT bit. If the CID is owned by the CPU
+ * then drop it.
*/
mm_drop_cid_on_cpu(mm, this_cpu_ptr(mm->mm_cid.pcpu));
}
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index bd350e40859d..1b4283e9edc3 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3758,8 +3758,10 @@ static __always_inline void mm_unset_cid_on_task(struct task_struct *t)
 static __always_inline void mm_drop_cid_on_cpu(struct mm_struct *mm, struct mm_cid_pcpu *pcp)
 {
 	/* Clear the ONCPU bit, but do not set UNSET in the per CPU storage */
-	pcp->cid = cpu_cid_to_cid(pcp->cid);
-	mm_drop_cid(mm, pcp->cid);
+	if (cid_on_cpu(pcp->cid)) {
+		pcp->cid = cpu_cid_to_cid(pcp->cid);
+		mm_drop_cid(mm, pcp->cid);
+	}
 }

 static inline unsigned int __mm_get_cid(struct mm_struct *mm, unsigned int max_cids)