Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1

From: Jann Horn
Date: Tue Apr 28 2020 - 17:06:32 EST


On Tue, Apr 28, 2020 at 10:36 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Apr 28, 2020 at 12:08 PM Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> > Oops. I can update that old patch but somehow I thought there is a better
> > plan which I don't yet understand...
>
> I don't think any plan survived reality.
>
> Unless we want to do something *really* hacky.. The attached patch is
> not meant to be serious.
>
> > And, IIRC, Jan had some ideas how to rework the new creds calculation in
> > execve paths to avoid the cred_guard_mutex deadlock?
>
> I'm not sure how you'd do that.
>
> Execve() fundamentally needs to serialize with PTRACE_ATTACH somehow,
> since the whole point is that "tsk->ptrace" changes how the
> credentials are interpreted.
>
> So PTRACE_ATTACH doesn't really _change_ the credentials, but it very
> much changes what execve() will do with them.
>
> But I guess we could do a "if somebody attached to us while we did the
> execve(), just repeat the whole thing"
>
> Jann, what was your clever idea? Maybe it got lost in the long thread..

My clever/horrible/overly-complex idea was basically:

In execve:

- After the point of no return, but before we start waiting for the
other threads to go away, finish calculating our post-execve creds
and stash them somewhere in the task_struct or so.
- Drop the cred_guard_mutex.
- Wait for the other threads to die.
- Take the cred_guard_mutex again.
- Clear out the pointer in the task_struct.
- Finish execve and install the new creds.
- Drop the cred_guard_mutex again.

Then in ptrace_may_access, after taking the cred_guard_mutex, we'd
know that the target task is either outside execve or in the middle of
execve, with old and new credentials known; and then we could say "you
only get to access that task if you're capable relative to *both* its
old and new credentials, since the task currently has both state from
the old executable and from the new one". (Other users that expect to
use cred_guard_mutex to synchronize with execve would also have to be
changed appropriately; e.g. seccomp tsync would have to bail out if
the task turns out to be in execve after the mutex has been acquired.)

So I think we can conceptually fix the deadlock, but it requires a bit
of refactoring. (I have an old branch somewhere in which I tried to
implement this, and where I did a bunch of refactoring around
ptrace_may_access() so that e.g. the LSM hooks for ptrace can be
invoked twice when the target task is in execve, and so that they take
the target's cred* as an argument.)