Re: [PATCH] prctl: require checkpoint_restore_ns_capable for PR_SET_MM_MAP

From: David Hildenbrand (Arm)

Date: Thu Apr 02 2026 - 09:38:05 EST


On 4/2/26 13:13, Qi Tang wrote:
> prctl_set_mm_map() allows modifying all mm_struct boundaries and
> the saved auxv vector. The individual field path (PR_SET_MM_START_CODE
> etc.) correctly requires CAP_SYS_RESOURCE, but the PR_SET_MM_MAP path
> dispatches before this check and has no capability requirement of its
> own when exe_fd is -1.
>
> This means any unprivileged user on a CONFIG_CHECKPOINT_RESTORE kernel
> (nearly all distros) can rewrite mm boundaries including start_brk, brk,
> arg_start/end, env_start/end and saved_auxv. Consequences include:
>
> - SELinux PROCESS__EXECHEAP bypass via start_brk manipulation
> - procfs info disclosure by pointing arg/env ranges at other memory
> - auxv poisoning (AT_SYSINFO_EHDR, AT_BASE, AT_ENTRY)
>
> The original commit f606b77f1a9e ("prctl: PR_SET_MM -- introduce
> PR_SET_MM_MAP operation") states "we require the caller to be at least
> user-namespace root user", but this was never enforced in the code.

That is taken out of context, no?

"Still note that updating exe-file link now doesn't require sys-resource
capability anymore, ... Still we require the caller to be at least
user-namespace root user."

That check was added in prctl_set_mm_map()->validate_prctl_map() in the
original patch:

+	/*
+	 * Finally, make sure the caller has the rights to
+	 * change /proc/pid/exe link: only local root should
+	 * be allowed to.
+	 */
+	if (prctl_map->exe_fd != (u32)-1) {
+		struct user_namespace *ns = current_user_ns();
+		const struct cred *cred = current_cred();
+
+		if (!uid_eq(cred->uid, make_kuid(ns, 0)) ||
+		    !gid_eq(cred->gid, make_kgid(ns, 0)))
+			goto out;
+	}
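
To make it explicit what that hunk gates, here is a minimal userspace
sketch of the same logic (not kernel code). The names struct caller and
exe_fd_check_passes() are made up for illustration, and the uid/gid
fields stand in for the caller's IDs as seen in its own user namespace:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the caller's credentials, expressed as
 * IDs mapped in the caller's own user namespace. */
struct caller {
	uint32_t uid;
	uint32_t gid;
};

/* Models the quoted hunk: the root-in-namespace requirement applies
 * only when an exe_fd is actually supplied. */
static bool exe_fd_check_passes(const struct caller *c, uint32_t exe_fd)
{
	if (exe_fd != (uint32_t)-1) {
		if (c->uid != 0 || c->gid != 0)
			return false;
	}
	/* exe_fd == -1: this particular check is skipped. */
	return true;
}
```

So the "user-namespace root" wording in the commit message corresponds
to the exe_fd case only; with exe_fd == -1 this check does not apply.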


--
Cheers,

David