Re: [PATCH v2 1/3] capabilities: Introduce CAP_CHECKPOINT_RESTORE

From: Cyrill Gorcunov
Date: Tue Jun 09 2020 - 14:45:27 EST


On Wed, Jun 03, 2020 at 06:23:26PM +0200, Adrian Reber wrote:
> This patch introduces CAP_CHECKPOINT_RESTORE, a new capability facilitating
> checkpoint/restore for non-root users.
>
> Over the last years, The CRIU (Checkpoint/Restore In Userspace) team has been
> asked numerous times if it is possible to checkpoint/restore a process as
> non-root. The answer usually was: 'almost'.
>
> The main blocker to restore a process as non-root was to control the PID of the
> restored process. This feature available via the clone3 system call, or via
> /proc/sys/kernel/ns_last_pid is unfortunately guarded by CAP_SYS_ADMIN.
...
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index d86c0afc8a85..ce02f3a4b2d7 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2189,16 +2189,16 @@ struct map_files_info {
> };
>
> /*
> - * Only allow CAP_SYS_ADMIN to follow the links, due to concerns about how the
> - * symlinks may be used to bypass permissions on ancestor directories in the
> - * path to the file in question.
> + * Only allow CAP_SYS_ADMIN and CAP_CHECKPOINT_RESTORE to follow the links, due
> + * to concerns about how the symlinks may be used to bypass permissions on
> + * ancestor directories in the path to the file in question.
> */
> static const char *
> proc_map_files_get_link(struct dentry *dentry,
> struct inode *inode,
> struct delayed_call *done)
> {
> - if (!capable(CAP_SYS_ADMIN))
> + if (!(capable(CAP_SYS_ADMIN) || capable(CAP_CHECKPOINT_RESTORE)))
> return ERR_PTR(-EPERM);

First of all -- sorry for late reply. You know, looking into this code more
I think this CAP_SYS_ADMIN is simply wrong: for example I can't even fetch
links for /proc/self/map_files. Still /proc/$pid/maps (which as well points
to the files opened) test for ptrace-read permission. I think we need
ptrace-may-attach test here instead of these capabilities (if I can attach
to a process I can read any data needed, including the content of the
mapped files, if only I'm not missing something obvious).