Re: [PATCH] proc: allow restricting /proc/pid/mem writes

From: Adrian Ratiu
Date: Mon Feb 26 2024 - 17:33:44 EST


Hello

On Monday, February 26, 2024 21:24 EET, Kees Cook <keescook@xxxxxxxxxxxx> wrote:

> [sorry for the duplicate, fixing Jann's email address]
>
> On Mon, Feb 26, 2024 at 09:10:54AM -0800, Doug Anderson wrote:
> > Hi,
> >
> > On Wed, Feb 21, 2024 at 1:06 PM Adrian Ratiu <adrian.ratiu@xxxxxxxxxxxxx> wrote:
> > >
> > > Prior to v2.6.39 write access to /proc/<pid>/mem was restricted,
> > > after which it got allowed in commit 198214a7ee50 ("proc: enable
> > > writing to /proc/pid/mem"). Famous last words from that patch:
> > > "no longer a security hazard". :)
> > >
> > > Afterwards exploits appeared started causing drama like [1]. The
> > > /proc/*/mem exploits can be rather sophisticated like [2] which
> > > installed an arbitrary payload from noexec storage into a running
> > > process then exec'd it, which itself could include an ELF loader
> > > to run arbitrary code off noexec storage.
> > >
> > > As part of hardening against these types of attacks, distrbutions
> > > can restrict /proc/*/mem to only allow writes when they makes sense,
> > > like in case of debuggers which have ptrace permissions, as they
> > > are able to access memory anyway via PTRACE_POKEDATA and friends.
> > >
> > > Dropping the mode bits disables write access for non-root users.
> > > Trying to `chmod` the paths back fails as the kernel rejects it.
> > >
> > > For users with CAP_DAC_OVERRIDE (usually just root) we have to
> > > disable the mem_write callback to avoid bypassing the mode bits.
> > >
> > > Writes can be used to bypass permissions on memory maps, even if a
> > > memory region is mapped r-x (as is a program's executable pages),
> > > the process can open its own /proc/self/mem file and write to the
> > > pages directly.
> > >
> > > Even if seccomp filters block mmap/mprotect calls with W|X perms,
> > > they often cannot block open calls as daemons want to read/write
> > > their own runtime state and seccomp filters cannot check file paths.
> > > Write calls also can't be blocked in general via seccomp.
> > >
> > > Since the mem file is part of the dynamic /proc/<pid>/ space, we
> > > can't run chmod once at boot to restrict it (and trying to react
> > > to every process and run chmod doesn't scale, and the kernel no
> > > longer allows chmod on any of these paths).
> > >
> > > SELinux could be used with a rule to cover all /proc/*/mem files,
> > > but even then having multiple ways to deny an attack is useful in
> > > case on layer fails.
> > >
> > > [1] https://lwn.net/Articles/476947/
> > > [2] https://issues.chromium.org/issues/40089045
> > >
> > > Based on an initial patch by Mike Frysinger <vapier@xxxxxxxxxxxx>.
> > >
> > > Cc: Guenter Roeck <groeck@xxxxxxxxxxxx>
> > > Cc: Doug Anderson <dianders@xxxxxxxxxxxx>
> > > Signed-off-by: Mike Frysinger <vapier@xxxxxxxxxxxx>
>
> This should have a "Co-developed-by: Mike..." tag, since you're making
> changes and not just passing it along directly.

Thanks, I'll address this in v2.

>
> > > Signed-off-by: Adrian Ratiu <adrian.ratiu@xxxxxxxxxxxxx>
> > > ---
> > > Tested on next-20240220.
> > >
> > > I would really like to avoid depending on CONFIG_MEMCG which is
> > > required for the struct mm_stryct "owner" pointer.
> > >
> > > Any suggestions how check the ptrace owner without MEMCG?
> > > ---
> > > fs/proc/base.c | 26 ++++++++++++++++++++++++--
> > > security/Kconfig | 13 +++++++++++++
> > > 2 files changed, 37 insertions(+), 2 deletions(-)
> >
> > Thanks for posting this! This looks reasonable to me, but I'm nowhere
> > near an expert on this so I won't add a Reviewed-by tag.
> >
> > This feels like the kind of thing that Kees might be interested in
> > reviewing, so adding him to the "To" list.
>
> I'd love to make /proc/$pid/mem more strict. A few comments:
>
> > [...]
> > + if (ptracer_capable(current, mm->user_ns) &&
>
> It really looks like you're trying to do a form of ptrace_may_access(),
> but _without_ the introspection exception?
>
> Also, using "current" in the write path can lead to problems[1], so this
> should somehow use file->f_cred, or limit write access during the open()
> instead?

I think Mike explained pretty well why we need to check if current already
is a ptracer. The point you raise is valid (thanks again) so we need to check
a bit earlier, during open().

>
> > [...]
> > +config SECURITY_PROC_MEM_RESTRICT_WRITES
>
> Instead of a build-time CONFIG, I'd prefer a boot-time config (or a
> sysctl, but that's be harder given the perms). That this is selectable
> by distro users, etc, and they don't need to rebuild their kernel to
> benefit from it.

Ack, I'll implement a cmdline arg in v2.

>
> Jann Horn has tried to restrict access to this file in the past as well,
> so he may have some additional advice about it.

I'll leave this a few more days in case others have more ideas, then will
send v2 and also add Jann to the "To:" list.

>
> -Kees
>
> [1] https://docs.kernel.org/security/credentials.html#open-file-credentials
>
> --
> Kees Cook