Re: [PATCH] prctl: remove one-shot limitation for changing exe link

From: Stanislav Kinsburskiy
Date: Wed Aug 10 2016 - 23:39:51 EST




30.07.2016 19:31, Eric W. Biederman пишет:
Cyrill Gorcunov <gorcunov@xxxxxxxxx> writes:

On Mon, Jul 25, 2016 at 02:56:43PM -0500, Eric W. Biederman wrote:
...
Also there is a big fat bug in prctl_set_mm_exe_file. It doesn't
validate that the new file is a actually mmaped executable. We would
definitely need that to be fixed before even considering removing the
limit.
Could you please elaborate? We check for inode being executable,
what else needed?
That the inode is mmaped into the process with executable mappings.

Effectively what we check the old mapping for and refuse to remove the old
mm_exe_file if it exists.

I think a reasonable argument can be made that if the file is
executable, and it is mmaped with executable pages that exe_file is not
a complete lie.
I might be missing something obvious, so sorry for the question --
when criu setups old exe link the inode we obtain from file open
is not mapped into memory, the old exe not read by anyone because
it's not even executed anyhow. So I don't really understand which
mapping we should check here. Mind to point me?
That sounds like an out and out bug that should not be preserved.
Of course we should mmap the executable and set it up so that it can be
executed (at least as much as the executable was previously mapped).
Anything else is a buggy restart, and lying to userspace.

Which is the important part. At the end of the day how much can
userspace trust /proc/pid/exe? If we are too lax it is just a random
file descriptor we can not trust at all. At which point there is
exactly no point in preserving it in checkpoint/restart, because nothing
will trust or look at it.
You know, I think we should not trust exe link much, and in real we
never could: this link is rather a hint pointing which executable a
process has been using on execve call, once the process start working
one can't be sure if the code currently running is exactly from the
file pointed by exe link. It just a hint suitable for debuggin and
obtain clean view of which processes are running on noncompromised
system. Monitoring exe link change won't help much if there are
malicious software running on the system.
But it is not just a hint. It is a record of which executable we called
execve on. Knowing which file was executed doesn't guarantee what is
running now but it provides a very strong hint.

At then end of a restart the state of a process should be (by
definition) exactly the state the process was before a checkpoint
and thus a state the original executable could have gotten into.

I admit it is possible for an application to unmap itself. I honestly
have not met that application (except perhaps criu).

If the only user is checkpoint/restart perhaps it should be only ptrace
that can set this and not the process itself with a prctl. I don't
know. All I know is that we should work on making it a very trustable
value even though in some specific instances we can set it.
Since as I said I suppose nobody except us using this feature, we can
setup some sysctl trigger for it (I personally think this is an
overkill, but OTOH if people rely on the exe link and not going
to use criu at all, this trigger will help).
Some clarity of thought came to me, and I apologise for not replying
sooner with it sooner.

My problem with the original patch submission is that it was
justifying changing prctl_set_mm_exe_file based on what
prctl_set_mm_exe_file does today. As prctl_set_mm_exe_file was added
for the checkpoint/restart case that is justifying changing code based
on a buggy implementation.

It is necessary to look at the ordinary situation. Without
prctl_set_mm_exe /proc/[pid]/exe can be counted on as a record
of which executable was last passed to execve. Furthermore the state of
a process can be counted on to be a state reachable from calling
execve on /proc/[pid]/exe.

Which means to preserve those expectations prctl_set_mm_exe_file should
in practice just be a nicer less cumbersome interface to things you can
already achieve with execve.

Justifying removale of the one-short nature for prctl_set_mm_exe_file
is as straight forward as noting that a process can call execve on
any executable file.

However when I compare the invariants that execve has on a file (such as
the executable being mmaped) I see some noticable disparities between
what prctl_set_mm_exe_file allows and what execve allows. With
prctl_set_mm_exe being less strict.

So what I am requesting is very simple. That the checks in
prctl_set_mm_exe_file be tightened up to more closely approach what
execve requires. Thus preserving the value of the /proc/[pid]/exe for
the applications that want to use the exe link.

Eric, could you elaborate on the checks, you mentioned?
I don't see any significant checks, which are applicable to
prctl_set_mm_exe_file.
Say, there is a check for PF_NPROC_EXCEEDED in execve, but this is not applicable.
Some of the checks are related to file open, but this is not done in prctl_set_mm_exe_file.
There is some logic, related to "close on exec", but this is something which has to be avoided.
Would be nice to have an example of a check, which is missing.

Once the checks in prctl_set_mm_exe_file are tightened up please feel
free to remove the one shot test.

Eric