Re: [PATCH] mm: don't allow empty relative nodemask in mpol_relative_nodemask()
From: David Hildenbrand (Arm)
Date: Fri Jun 05 2026 - 11:31:59 EST

On 6/2/26 17:01, Farhad Alemi wrote:
> Confirmed, with a standalone reproducer (attached); it panics linus/master
> at e8c2f9fdadee. cs->mems_allowed can legitimately be empty
> on v2 -- a freshly created cpuset child that never had cpuset.mems
> written keeps mems_allowed empty (never initialized) while effective_mems
> is inherited non-empty in cpuset_css_online(), and v2 allows attaching
> tasks to it (the empty-mems guard in cpuset_can_attach_check() is gated
> on !is_in_v2_mode()). So the non-empty guarantee holds for effective_mems,
> not for the configured cs->mems_allowed; forbidding empty cpuset.mems
> would break v2's inherit-from-parent semantics.
>
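
To make the configured-vs-effective distinction concrete, here is a small
userspace model. It is only a sketch: struct cs_model and both helpers are
hypothetical names, a uint64_t stands in for nodemask_t, and none of this is
the kernel code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of the two masks discussed above. */
struct cs_model {
	uint64_t mems_allowed;   /* configured via cpuset.mems; 0 if never written */
	uint64_t effective_mems; /* what tasks in the cpuset may actually use */
};

/*
 * Models bringing a v2 child online: the effective mask is inherited
 * from the parent, while the configured mask stays empty.
 */
static void cs_model_online(struct cs_model *child,
			    const struct cs_model *parent)
{
	child->mems_allowed = 0;
	child->effective_mems = parent->effective_mems;
}

/*
 * Picks the mask to hand to a rebind. Only the effective mask carries
 * the non-empty guarantee, so that is the one returned.
 */
static uint64_t cs_model_rebind_mask(const struct cs_model *cs)
{
	assert(cs->effective_mems != 0);
	return cs->effective_mems;
}
```

In this model, a child that was never configured has mems_allowed == 0 but
still yields a usable rebind mask.
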
> The reproducer enables +cpuset, mkdirs a child without writing
> cpuset.mems, moves a task in, mbind()s a VMA with
> MPOL_BIND | MPOL_F_RELATIVE_NODES, and offlines a CPU; the hotplug walk
> then calls mpol_rebind_mm(mm, &cs->mems_allowed) with the empty mask, and
> mpol_relative_nodemask() folds modulo nodes_weight(*rel) == 0, a divide
> by zero (console logs attached).
>
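
For anyone following along, the fold-by-zero can be modeled in userspace.
This is a minimal sketch, not the kernel implementation: a uint64_t stands in
for nodemask_t, and mask_weight(), mask_fold(), mask_onto() and
relative_mask() are hypothetical stand-ins for the fold-then-onto sequence.
The explicit weight check is there only to show where the empty mask bites;
it is not a proposal for where the real fix belongs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of set bits in the mask. */
static unsigned int mask_weight(uint64_t m)
{
	unsigned int n = 0;

	while (m) {
		m &= m - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Fold: every set bit i of orig lands on bit (i % sz). sz must be != 0. */
static uint64_t mask_fold(uint64_t orig, unsigned int sz)
{
	uint64_t dst = 0;

	assert(sz != 0);	/* i % 0 is the divide error */
	for (unsigned int bit = 0; bit < 64; bit++)
		if (orig & (UINT64_C(1) << bit))
			dst |= UINT64_C(1) << (bit % sz);
	return dst;
}

/* Onto: the n-th set bit of rel is set in the result iff bit n of orig is set. */
static uint64_t mask_onto(uint64_t orig, uint64_t rel)
{
	uint64_t dst = 0;
	unsigned int n = 0;

	for (unsigned int bit = 0; bit < 64; bit++) {
		if (!(rel & (UINT64_C(1) << bit)))
			continue;
		if (orig & (UINT64_C(1) << n))
			dst |= UINT64_C(1) << bit;
		n++;
	}
	return dst;
}

/* Returns false instead of folding when rel is empty. */
static bool relative_mask(uint64_t *ret, uint64_t orig, uint64_t rel)
{
	unsigned int w = mask_weight(rel);

	if (w == 0)
		return false;
	*ret = mask_onto(mask_fold(orig, w), rel);
	return true;
}
```

With orig = 0x5 and rel = 0x30, both orig bits fold to relative position 0,
which maps onto node 4, so the result is 0x10. With rel = 0 the helper
refuses rather than dividing.
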
> The newmems instinct looks right: it's the effective, online mask the
> task is actually allowed to use, guarantee_online_mems() keeps it
> non-empty, and it matches cpuset_attach(), which already rebinds against
> cs->effective_mems. The fix this implies:
>
> - mpol_rebind_mm(mm, &cs->mems_allowed);
> + mpol_rebind_mm(mm, &newmems);
>
> I built the current base (e8c2f9fdadee) with and without this one-liner:
> the unpatched kernel panics on the first cpu1 offline, while the patched
> kernel runs the reproducer's 8 offline/online cycles cleanly, with no
> divide error.
>
> This regressed in ae1c802382f7 ("cpuset: apply cs->effective_{cpus,mems}",
> v3.17), which moved cpuset_attach() to the effective mask but left this
> rebind on cs->mems_allowed.
>
> Happy to send this as a proper patch (Fixes: ae1c802382f7, Cc: stable,
> reproducer) if you agree the cpuset side is right, or to test a
> mempolicy-side fix if not.

Yes, please send a patch, and include a high-level explanation of the
analysis above!

--
Cheers,
David