[patch 0/4] sched/mmcid: Cure fork()/vfork() related problems
From: Thomas Gleixner
Date: Tue Mar 10 2026 - 16:28:56 EST
Matthieu and Jiri reported CPU stalls where a CPU got stuck in mm_get_cid():
https://lore.kernel.org/b24ffcb3-09d5-4e48-9070-0b69bc654281@xxxxxxxxxx
After some tedious debugging it turned out to be another subtle (or not so
subtle) ownership mode change issue.
The logic handling vfork()'ed tasks in sched_mmcid_fixup_tasks_to_cpus() is
broken. It is invoked when the number of tasks associated with a process is
smaller than the number of MMCID users. It then walks the task list to find
the vfork()'ed task, but accounts all the already processed tasks as well.
If that double processing brings the number of to be handled tasks to 0,
the walk stops and the vfork()'ed task's CID is not fixed up. As a
consequence, a subsequent schedule-in fails to acquire a (transitional) CID
and the machine stalls.
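
The failure mode described above can be illustrated with a small userspace
model. This is only a sketch, not the kernel code: struct toy_task,
toy_fixup() and the skip_processed switch are hypothetical names invented
for this illustration, and the real logic in
sched_mmcid_fixup_tasks_to_cpus() and the actual fix in this series may
look quite different. The model only assumes what is stated above: a first
pass handles the tasks of the process, a second pass walks the task list
with a countdown of remaining users, and the walk stops when that countdown
reaches 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a task. Not the kernel's task_struct. */
struct toy_task {
	int  mm_id;	/* Address space the task runs on */
	bool in_group;	/* Task belongs to the process' thread group */
	bool fixed;	/* CID of this task has been fixed up */
};

/*
 * Toy model of the two stage fixup for all users of @mm_id.
 *
 * With @skip_processed == false the second walk accounts tasks which were
 * already handled in the first walk (the behaviour described as broken).
 * With @skip_processed == true those tasks are ignored. That switch is an
 * assumption made for illustration, not necessarily the real fix.
 *
 * Returns the number of users of @mm_id which were left without a fixup.
 */
static unsigned int toy_fixup(struct toy_task *t, size_t n, int mm_id,
			      bool skip_processed)
{
	unsigned int todo = 0, missed = 0;
	size_t i;

	assert(t != NULL || n == 0);

	/* Reset the model state and count all users of the mm */
	for (i = 0; i < n; i++) {
		t[i].fixed = false;
		if (t[i].mm_id == mm_id)
			todo++;
	}

	/* Stage 1: handle the tasks of the process itself */
	for (i = 0; i < n && todo; i++) {
		if (t[i].mm_id == mm_id && t[i].in_group) {
			t[i].fixed = true;
			todo--;
		}
	}

	/* Stage 2: users are left over, walk the whole task list */
	for (i = 0; i < n && todo; i++) {
		if (t[i].mm_id != mm_id)
			continue;
		if (skip_processed && t[i].fixed)
			continue;
		/* In the broken variant a processed task is accounted again */
		t[i].fixed = true;
		todo--;
	}

	/* Count the users which the walks never reached */
	for (i = 0; i < n; i++) {
		if (t[i].mm_id == mm_id && !t[i].fixed)
			missed++;
	}
	return missed;
}
```

In this model, a process with two threads plus one vfork()'ed child that
sits later in the task list shows the problem: the first walk handles the
two threads and leaves one user to do, the second walk then hits an already
processed thread first, drops the countdown to 0 and stops before it reaches
the child.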
Peter and I also discovered that there is an as yet unreported issue
vs. concurrent forks. Jiri noticed it independently.
The following series fixes those issues. It applies on top of Linus' tree.
Thanks a lot to Matthieu and Jiri for providing valuable debug
information and running the debug patches!
Thanks,
tglx
---
include/linux/rseq_types.h | 6 ++-
include/linux/sched.h | 2 -
kernel/fork.c | 3 -
kernel/sched/core.c | 79 +++++++++++++++------------------------------
4 files changed, 34 insertions(+), 56 deletions(-)