Re: [PATCH] ptrace: restore smp_rmb() in __ptrace_may_access()
From: Jann Horn
Date: Fri May 31 2019 - 11:08:24 EST
On Thu, May 30, 2019 at 3:42 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> Jann Horn <jannh@xxxxxxxxxx> writes:
>
> > I'm actually trying to get rid of the ->mm access in
> > __ptrace_may_access() entirely by moving the dumpability and the
> > user_ns into the signal_struct, but I don't have patches for that
> > ready (yet).
>
> Do you have a plan for dealing with old linux-threads style threads
> where you have two processes that share the same mm, but have different
> signal_structs.
Oh, I hadn't realized that linux-threads exists and uses this feature...
> I don't think it is required to share any other structures except
> mm_struct when you share mm_struct. Maybe sighand_struct.
>
> Not to derail your idea. Only needing to look at signal_struct sounds
> very nice. I just know we have some other somewhat bizarre cases the
> kernel still supports.
My line of thought was that conceptually, dumpability doesn't have
much to do with the mm_struct. Dumpability has the following purposes,
as far as I know:
1. Prevent the creation of core dumps with elevated privileges in
attacker-specified locations (if the core_pattern is a relative path).
This could happen in the following scenarios:
1a) setuid executable crashes with elevated privileges (dumpability is
reduced in setup_new_exec() on privileged execve() for this reason)
1b) a privileged process is deliberately sharing its fs_struct with an
attacker-controlled one
2. Prevent reading the memory of processes that are running
execute-only binaries [*] (dumpability is reduced in would_dump() on
execve() for this reason)
3. Prevent ptrace-attaching (and similar forms of access) against
formerly-privileged processes that have dropped UID/GID/caps
privileges, but still have some other form of privilege left (e.g.
file descriptors). Similarly, prevent reading process memory after a
privilege transition by triggering a core dump.
For numbers 1a and 2, it doesn't matter on which level the flag is -
during execve, the signal_struct and the new mm_struct are both not
shared, so the effect is the same.
For number 1b, you're probably not sharing the mm_struct, so either
way the privileged process needs to mark itself as nondumpable or
already be nondumpable for some reason. (I think I know a single
example where this actually happens, and that one's a setuid helper,
so it's nondumpable from the start anyway.)
For number 3, when the kernel automatically marks a process as
nondumpable during commit_creds(), I don't think it matters for
security on which level that change happens, whether it's on the task
level, the signal_struct level, or the mm_struct level, since you can
only attach to a task that has itself gone through the
privilege-dropping process - in other words, as long as the scope of
the dumpability flag includes the scope of the credentials (which is
per-task), it should be fine. I think the behavior actually makes more
sense here if it happens on the signal_struct level - for number 3, if
one process with shared mm drops privileges, that is irrelevant for
other processes sharing the mm, since they remain inaccessible until
they also go through a similar credential change.
For manual control of the dumpability through PR_SET_DUMPABLE, the
prctl(2) manpage says that dumpability is a property of "the process",
which is the same wording that is also used for per-task properties
and at least one per-thread-group property in there; so I was hoping
that I could get away with just fudging the semantics so that
PR_SET_DUMPABLE only affects the thread group. In case someone tells
me that I can't do that because it would break something, my backup
plan was to do something really ugly in PR_SET_DUMPABLE, similar to
what zap_threads() and __set_oom_adj() do, like this:
if (refcount_read(current->signal->sigcnt) != 1) {
for_each_process(p) {
if (READ_ONCE(p->mm) != current->mm)
continue;
task_lock(p);
if (p->mm == current->mm)
WRITE_ONCE(p->signal->dumpable, dumpable);
task_unlock(p);
}
}
But I'd really like to avoid that, because it makes the code messier,
slower, and in my opinion, less logical.
The reason why I want to make this change is that I think the current
fail-open behavior of __ptrace_may_access() for a process whose
mm_struct has gone away is dangerous; but I also don't want to just
make it fail-closed. So I have to shove the dumpability information
somewhere else (or muck around with the mm_struct's lifetime, but I'd
like to avoid that).
[*]: That doesn't really work though, does it? I don't think anything
prevents running an execute-only program in a task that already has a
ptracer attached to it. And for dynamically-linked binaries, I think
you can probably still create a new user namespace, do chroot() in
there, and then the kernel will resolve the absolute path to the ELF
loader inside the chroot(), and then your own fake ELF loader can dump
the binary. And I don't think the AT_SECURE flag gets set, so you can
use LD_PRELOAD or whatever.