Re: [patch V3 00/12] rseq: Implement time slice extension mechanism
From: Mathieu Desnoyers
Date: Wed Nov 12 2025 - 15:46:32 EST
On 2025-11-12 15:31, Thomas Gleixner wrote:
On Tue, Nov 11 2025 at 11:42, Mathieu Desnoyers wrote:
On 2025-11-10 09:23, Mathieu Desnoyers wrote:
I've spent some time digging through Thomas' implementation of
mm_cid management. I've spotted something which may explain
the watchdog panic. Here is the scenario:
1) A process is constrained to a subset of the possible CPUs,
and has enough threads to swap from per-thread to per-cpu mm_cid
mode. It runs happily in that per-cpu mode.
2) The number of allowed CPUs is increased for that process, thus invoking
mm_update_cpus_allowed. This switches the mode back to per-thread,
but delays invocation of mm_cid_work_fn to some point in the future,
in thread context, through irq_work + schedule_work.
At that point, because only __mm_update_max_cids was called by
mm_update_cpus_allowed, the max_cids is updated, but mc->transit
is still zero.
Also, until mm_cid_fixup_cpus_to_tasks is invoked by either the
scheduled work or near the end of sched_mm_cid_fork, or by
sched_mm_cid_exit, we are in a state where mm_cids are still
owned by CPUs, but we are now in per-thread mm_cid mode, which
means that the mc->max_cids value depends on the number of threads.
No. It stays in per CPU mode. The mode switch itself happens either in
the worker or on fork/exit whatever comes first.
Ah, that's what I missed. All good then.
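For readers following along, the deferred switch clarified above can be
sketched as a small userspace toy model in C. This is not the kernel code:
every name here (toy_mm, toy_update_cpus_allowed, toy_try_finish_switch,
switch_pending) is hypothetical, and the model assumes that the affinity
change is one that warrants a switch back to per-thread mode. It only
illustrates the ordering: the limit is updated right away, the mode stays
per-CPU, and whichever of the worker or fork/exit runs first completes
the switch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: names and fields are hypothetical, not the kernel's. */
enum toy_mode { TOY_PER_THREAD, TOY_PER_CPU };

struct toy_mm {
    enum toy_mode mode;     /* current CID ownership mode */
    unsigned int max_cids;  /* limit, updated immediately on affinity change */
    bool switch_pending;    /* a mode switch was requested but not yet done */
};

/* Affinity grew: update the limit, but only request the mode switch. */
static void toy_update_cpus_allowed(struct toy_mm *mm, unsigned int new_max)
{
    mm->max_cids = new_max;
    if (mm->mode == TOY_PER_CPU)
        mm->switch_pending = true;  /* the mode itself is left untouched */
}

/*
 * Stands in for the worker and for the fork/exit paths. The first caller
 * performs the switch; later callers see nothing pending and return false.
 */
static bool toy_try_finish_switch(struct toy_mm *mm)
{
    if (!mm->switch_pending)
        return false;
    assert(mm->mode == TOY_PER_CPU);
    /* The fixup moving CPU-owned CIDs to tasks would happen here. */
    mm->mode = TOY_PER_THREAD;
    mm->switch_pending = false;
    return true;
}
```

In this model, between toy_update_cpus_allowed() and the first successful
toy_try_finish_switch(), the mm is still in per-CPU mode with the new
limit already in place, which is the window discussed in the scenario.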
[...]
There was an issue in V3 with the uninitialized transit member and an
off-by-one in one of the transition functions. It's fixed in the git
tree, but I haven't posted it yet because I was AFK for a week.
I did not notice the V3 issue because tests passed on a small machine,
but after I did a rebase to the tip rseq and uaccess bits, I noticed the
failure because I tested on a larger box.
Good! We'll see if this fixes the issue observed by Prakash. If not,
I'm curious to validate that num_possible_cpus() is always set to its
final value before _any_ mm is created.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com