Capabilities are lost when creating a user namespace

From: Idan Yadgar
Date: Sun May 24 2020 - 08:33:22 EST


Hello,

A process that changes its user namespace (via unshare or setns), or a
process that is created by clone with the CLONE_NEWUSER flag, has all
capabilities inside the new namespace and loses all of its capabilities
in the parent/previous user namespace.
This poses a problem because some operations require a capability in a
user namespace other than the process's current one. The man pages
state in several places that a system call requires a capability in the
initial user namespace (for example, open_by_handle_at requires
CAP_DAC_READ_SEARCH in the initial user namespace), but a process cannot
hold such a capability unless it is owned by root, which prevents
open_by_handle_at from being used inside a user namespace.
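
To make the behaviour above concrete, here is a minimal sketch (not part
of any proposed patch) that prints the effective capability mask of the
calling process before and after unshare(CLONE_NEWUSER). It assumes a
Linux system with /proc mounted and unprivileged user namespaces enabled;
the helper names (parse_cap_eff, has_cap, current_cap_eff) are made up for
this illustration. Note that the mask reported after the call is relative
to the new namespace only, so a set CAP_DAC_READ_SEARCH bit there says
nothing about the initial user namespace.

```python
import ctypes
import os

CLONE_NEWUSER = 0x10000000   # value from <linux/sched.h>
CAP_DAC_READ_SEARCH = 2      # value from <linux/capability.h>


def parse_cap_eff(status_text):
    """Return the effective capability mask found in /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_cap(mask, cap):
    """Return True if capability number `cap` is set in `mask`."""
    return bool(mask & (1 << cap))


def current_cap_eff():
    """Read the effective capability mask of the calling process."""
    with open("/proc/self/status") as f:
        return parse_cap_eff(f.read())


def main():
    print("before unshare: CapEff = %016x" % current_cap_eff())

    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(CLONE_NEWUSER) != 0:
        # May fail, e.g. if user namespaces are disabled on this system.
        print("unshare(CLONE_NEWUSER) failed:", os.strerror(ctypes.get_errno()))
        return

    mask = current_cap_eff()
    # This mask describes capabilities in the NEW user namespace only.
    print("after unshare:  CapEff = %016x" % mask)
    print("CAP_DAC_READ_SEARCH in new namespace:",
          has_cap(mask, CAP_DAC_READ_SEARCH))


if __name__ == "__main__":
    main()
```

Run as an unprivileged user, this would typically show an empty mask
before the call and a full mask afterwards, yet calls that check the
initial user namespace are still refused.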

One way to solve this problem would be to allow a task (via prctl or
some other mechanism) to retain its capabilities in a given user
namespace, even when it is not a member of that namespace.

We would like to hear some thoughts about this issue and our proposed solution.