Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1

From: Linus Torvalds
Date: Thu Apr 02 2020 - 19:47:51 EST


On Thu, Apr 2, 2020 at 4:04 PM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>
> That is not the direction I intend to take either.

Ahh, good. Because those kinds of "play games with dropping locks in
the middle based on conditionals" really have been horrible.

Yes, we've done it, and it's almost always been asource of truly subtle bugs.

> The exec_update_mutex is to be a strict subset of today's
> cred_guard_mutex. I tried to copy cred_guard_mutex's unlock style so
> that was obvious and that turns out was messier than I intended.

Yes. That is why I had no problem pulling that subset, and my worries
were mainly about the explanations and that flag use.

> Since I thought I already knew what was in the patches and the worst
> problem was the missing unlock of cred_guard_mutex, and I know Bernd's
> patches had been tested I applied them. I missed that Bernd had added
> the exec_mmap_called flag into my patch. I thought he had only added
> the missing unlock.

Ahh, so you meant for all of that to be entirely static refactoring,
rather than the conditional unlock depending on just how far we had
gotten.

Good, that's generally the much superior approach.

I absolutely _hate_ the "drop and retake" model, unless it's a very
local case with a very explicit retry path.

In contrast, the "we have a flag that shows how far we've gotten"
_has_ been a successful model, and while I much prefer a static "lock
pairs with unlock", that "I have done this, so you need to unlock" is
not entirely out of the question when the static rules become too
complex to think about.

The vfs code has something similar in FMODE_OPENED which is basically
a flag saying "I actually made it all the way to the ->open()"
callback. We used to have a static model, but the rules for when we
can use fput(), and when we have to use fdrop() were too hard for
people.

> I spotted the weirdness in unlocking exec_update_mutex, and because it
> does fix a real world deadlock with ptrace I did not back it out from my
> tree.
>
> I have been much laxer on the details than I like to be my apologies.

Ok, as long as we have a sane plan..

And

> The plan is:
> exec_udpate_mutex will cover just the subset of cred_guard_mutex
> after the point of no return, and after we do any actions that
> might block waiting for userspace to do anything.
>
> So exec_update_mutex will just cover those things that exec
> is updating, so if you want an atomic snapshot of them
> plus the appropriate struct cred you can grab exec_update_mutex.
>
> I added a new mutex instead of just fixing cred_guard_mutex so
> that we can update or revert the individual code paths if it
> is found that something is wrong.
>
> The cred_guard_mutex also prevents other tasks from starting
> to ptrace the task that is exec'ing, and other tasks from
> setting no_new_privs on the task that is exec'ing.
>
> My plan is to carefully refactor the code so it can perform
> both the ptrace and no_new_privs checks after the point of
> no return.

Ok. Sounds good.

> I have uncovered all kinds of surprises while working in that direction
> and I realize it is going to take a very delicate and careful touch to
> achieve my goal.
>
> There are silly things like normal linux exec when you are ptraced and
> exec changes the credentials the ordinary code will continue with the
> old credentials, but the an LSM enabled your process is likely to be
> killed instead.

Yeah. The "continue with old credentials" is actually very traditional
and the original behavior, and is useful for handling the case of
debugging something that is suid, but doesn't necessarily _require_
it.

But the LSM's just say yes/no.

I have this dim memory that it also triggers when you do the debugging
as root, but that may be some medication-induced memory.

> There is the personal mind blowing scenario where selinux will increase
> your credentials upon exec but if a magic directive is supplied in it's
> rules will avoid setting AT_SECURE, so that userspace won't protect
> itself from hostile takeover from the pre credential change environment.
> Much to my surprise "noatsecure" is a known and documented feature of
> selinux. I am not certain but I think I even spotted it in use on
> production.

We have had a _ton_ of random small rules so that people could enable
SElinux in legacy environments.

They are _probably_ effectively dead code in this day and age, but
it's hard to tell...

Linus