Re: [PATCH] sysctl: add CAP_SYS_ADMIN check to panic/ctrl-alt-del sysctls

From: Dominik 'Disconnect3d' Czarnota

Date: Tue Jun 09 2026 - 08:34:07 EST

On Sat, Apr 18, 2026 at 05:14:06AM +1000, Aleksa Sarai wrote:
> More importantly, these kinds of check-permissions-at-write checks are
> actually quite weak because of setuid binaries -- you can always
> redirect the output of a setuid binary to a file to bypass these kinds
> of checks. A very classic example is mempodipper (CVE-2012-0056) but the
> recent "CrackArmor" bug (in particular CVE-2026-23268) is an even more
> obvious example. For those reasons, if an attacker can get access to an
> O_RDWR handle to a sysctl file, you're long since screwed.
>
> So, in my view this logic was somewhat flawed back then but makes
> absolutely no sense today and should not be propagated to other sysctls
> because it just gives a false sense of security.
>
> I also want to point out that the files you mentioned aren't even the
> juiciest targets -- kernel.core_pattern gives you free root code
> execution on the host if you can write to it.

Hi,

I was looking at this in the context of sethostname()/setdomainname(),
and procfs/sysctl seems inconsistent with the syscall interface there:
- the syscalls require:

      ns_capable(current->nsproxy->uts_ns->user_ns, CAP_SYS_ADMIN)

- /proc/sys/kernel/hostname and /proc/sys/kernel/domainname mutate the
  same UTS namespace state, but only go through the generic procfs DAC
  permission check.

So a process can fail the syscall after dropping effective
CAP_SYS_ADMIN, but still change the same value through sysctl if DAC
allows the write.

Is the "false sense of security" argument meant to apply to this direct
API consistency issue as well? The already-open-fd delegation case is a
valid concern, but it seems to show that write-time checks are
incomplete, not that the direct write path should be weaker than the
syscall path.

To summarize the current situation: no fd delegation or privileged helper
is needed. A task that passes procfs DAC can mutate UTS state
(hostname/domainname) directly after dropping CAP_SYS_ADMIN, while the
syscall path rejects the same operation.
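
For reference, here is a minimal userspace sketch of the comparison
described above. It is not a selftest, and the helper names
(drop_effective_sys_admin(), set_via_syscall(), set_via_sysctl(),
compare_uts_paths()) are made up for illustration. It clears
CAP_SYS_ADMIN from the effective set only, then tries both paths with
the current hostname, so nothing should actually change:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/capability.h>

/* Results of one run; each field is 0 on success or an errno value. */
struct uts_result {
	int drop_err;
	int syscall_err;
	int sysctl_err;
};

/* Clear CAP_SYS_ADMIN from the effective set; permitted is left alone. */
int drop_effective_sys_admin(void)
{
	struct __user_cap_header_struct hdr = {
		.version = _LINUX_CAPABILITY_VERSION_3,
		.pid = 0,	/* current task */
	};
	struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];

	/* The capability lives in the first 32-bit word of the set. */
	assert(CAP_SYS_ADMIN < 32);

	if (syscall(SYS_capget, &hdr, data) < 0)
		return errno;
	data[0].effective &= ~(1u << CAP_SYS_ADMIN);
	if (syscall(SYS_capset, &hdr, data) < 0)
		return errno;
	return 0;
}

/* Syscall path: sethostname() does the ns_capable() check. */
int set_via_syscall(const char *name)
{
	return sethostname(name, strlen(name)) == 0 ? 0 : errno;
}

/* Sysctl path: open and write /proc/sys/kernel/hostname directly. */
int set_via_sysctl(const char *name)
{
	int fd = open("/proc/sys/kernel/hostname", O_WRONLY);
	int err = 0;

	if (fd < 0)
		return errno;
	if (write(fd, name, strlen(name)) < 0)
		err = errno;
	close(fd);
	return err;
}

/* Drop the capability, then try to set the unchanged hostname both ways. */
struct uts_result compare_uts_paths(void)
{
	char name[256] = "";
	struct uts_result r;

	gethostname(name, sizeof(name) - 1);
	r.drop_err = drop_effective_sys_admin();
	r.syscall_err = set_via_syscall(name);
	r.sysctl_err = set_via_sysctl(name);
	return r;
}
```

Called from a small main() as root with /proc/sys writable, I would
expect syscall_err == EPERM and sysctl_err == 0, per the behaviour
described above. On other setups the sysctl write may instead fail with
EACCES or EROFS.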

If the already-open fd case is the concern, then the check should happen
when the file is opened for write, or should use file->f_cred so the
write is authorized against the opener's credentials. The kernel already
has examples in this direction: /proc/kcore rejects open without
CAP_SYS_RAWIO, /proc/<pid>/setgroups rejects open-for-write without
CAP_SYS_ADMIN in the target user namespace, and file_ns_capable() checks
the opener's credentials.

If those other files require capabilities, it seems to me that
kernel.core_pattern should as well. It controls global coredump
behaviour, including pipe helpers, and a pipe helper runs through the
usermodehelper path with root credentials and the default usermodehelper
capability set. Direct writes to this sysctl should therefore require
CAP_SYS_ADMIN rather than only procfs DAC permissions.

If you agree with this, should I send a patch for hostname/domainname
that makes their procfs writes (through proc_do_uts_string()) require
the same ns_capable() check, together with a selftest for this case?

Longer term, I think the sysctl documentation should state the policy
more explicitly. If writable /proc/sys is not meant to be a security
boundary, Documentation/admin-guide/sysctl/kernel.rst should say so
directly, so users know that environments exposing a writable /proc/sys
(e.g. containers) must treat write access to it as privileged.
Individual entries should then
document the non-obvious details: whether writes rely only on procfs DAC,
whether an additional capability or opener-credential check is done, and
whether the knob can have sensitive side effects such as core_pattern's
pipe helper execution. That would avoid having users infer the intended
security model from kernel behaviour and scattered examples.