Re: [PATCH 0/2] namespaces: log namespaces per task

From: Nicolas Dichtel
Date: Wed May 07 2014 - 05:35:17 EST


On 06/05/2014 23:15, Richard Guy Briggs wrote:
On 14/05/05, Nicolas Dichtel wrote:
On 02/05/2014 16:28, Richard Guy Briggs wrote:
On 14/05/02, Serge E. Hallyn wrote:
Quoting Richard Guy Briggs (rgb@xxxxxxxxxx):
I saw no replies to my questions when I replied a year after Aris' posting, so
I don't know if they were ignored or got lost in stale threads:
https://www.redhat.com/archives/linux-audit/2013-March/msg00020.html
https://www.redhat.com/archives/linux-audit/2013-March/msg00033.html
(https://lists.linux-foundation.org/pipermail/containers/2013-March/032063.html)
https://www.redhat.com/archives/linux-audit/2014-January/msg00180.html

I've tried to answer a number of questions that were raised in that thread.

The goal is not quite identical to Aris' patchset.

The purpose is to track namespaces in use by logged processes from the
perspective of init_*_ns. The first patch defines a function to list them.
The second patch provides an example of usage for audit_log_task_info() which
is used by syscall audits, among others. audit_log_task() and
audit_common_recv_message() would be other potential use cases.

Use a serial number per namespace (unique across one boot of one kernel)
instead of the inode number (for which the right to change has reportedly been
reserved, and which is not necessarily unique if there is more than one proc
fs). It could be argued that the inode numbers have now become a de facto
interface and can't change, but I'm proposing this approach to see if it helps
address some of the objections to the earlier patchset.
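
To make the serial-number idea concrete, here is a minimal userspace sketch in
C. It is not taken from the patchset: the names ns_sketch, ns_serial_counter,
ns_serial_next() and ns_sketch_init() are hypothetical, and real kernel code
would use the kernel's own atomic primitives rather than C11 <stdatomic.h>. It
only shows a boot-wide counter handing out a serial that is never reused.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel namespace object. */
struct ns_sketch {
	uint64_t serial;	/* assigned once at creation, never reused */
};

/* One counter per running kernel; 0 is reserved for "no serial assigned". */
static _Atomic uint64_t ns_serial_counter = 0;

/* Return the next unused serial number (starts at 1). */
static uint64_t ns_serial_next(void)
{
	return atomic_fetch_add(&ns_serial_counter, 1) + 1;
}

/* Give a freshly created namespace its serial number. */
static void ns_sketch_init(struct ns_sketch *ns)
{
	ns->serial = ns_serial_next();
}
```

Because the counter restarts at every boot, such a serial identifies a
namespace only within one running kernel, which is why the migration question
below still needs a separate answer.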

Messages could also be added to track the creation and destruction of
namespaces, listing the parent for hierarchical namespaces such as pidns and
userns, and listing other ids for non-hierarchical namespaces, as well as other
information to help identify a namespace.

There has been some progress made for audit in net namespaces and pid
namespaces since this previous thread. net namespaces are now served as peers
by one auditd in the init_net namespace with processes in a non-init_net
namespace being able to write records if they are in the init_user_ns and have
CAP_AUDIT_WRITE. Processes in a non-init_pid_ns can now similarly write
records. As for CAP_AUDIT_READ, I just posted a patchset to check capabilities
of userspace processes that try to join netlink broadcast groups.


Questions:
Is there a way to link serial numbers of namespaces involved in migration of a
container to another kernel? (I had a brief look at CRIU.) Is there a unique
identifier for each running instance of a kernel? Or at least some identifier
within the container migration realm?

Eric Biederman has always been adamantly opposed to adding new namespaces
of namespaces, so the fact that you're asking this question concerns me.

I have seen that position and I don't fully understand the justification
for it other than added complexity.
Just FYI, have you seen this thread:
http://thread.gmane.org/gmane.linux.network/286572/

There are some explanations and examples about this topic.

Thanks for that reference. I read it through, but will need to do so
again to get it to sink in.

I think audit has the same problem as x-netns netdevices: being able to
identify a peer netns when a userland app "reads" a message from the kernel.

The main problem with file descriptors is that you cannot use them when you
broadcast a message from the kernel to userland.

Maybe we can use the local names concept (like file descriptors but without
their constraints), i.e. having an identifier of a peer (net)ns which is only
valid in the current (net)ns. When the kernel needs to identify a peer (net)ns,
it uses this identifier (or allocates it the first time). After that, userland
apps may reuse this identifier to configure things in the peer (net)ns.
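
As a rough illustration of the local names idea, here is a small userspace
sketch in C. It is only an assumption of how such a table could look: the
names netns_sketch, MAX_PEERS, peer_id_get() and peer_lookup() are
hypothetical, the fixed-size array is an arbitrary simplification, and
locking and lifetime handling are left out entirely.

```c
#include <stddef.h>

#define MAX_PEERS 8	/* arbitrary bound, for the sketch only */

/* Hypothetical namespace object holding its own table of peer ids. */
struct netns_sketch {
	const struct netns_sketch *peers[MAX_PEERS];	/* peers[i] has local id i + 1 */
	int npeers;
};

/*
 * Return the id under which 'self' knows 'peer', allocating one on first
 * use. Returns -1 if the table is full.
 */
static int peer_id_get(struct netns_sketch *self,
		       const struct netns_sketch *peer)
{
	for (int i = 0; i < self->npeers; i++)
		if (self->peers[i] == peer)
			return i + 1;
	if (self->npeers == MAX_PEERS)
		return -1;
	self->peers[self->npeers] = peer;
	return ++self->npeers;
}

/* Resolve a local id back to a peer; NULL if 'self' never allocated it. */
static const struct netns_sketch *peer_lookup(const struct netns_sketch *self,
					      int id)
{
	if (id < 1 || id > self->npeers)
		return NULL;
	return self->peers[id - 1];
}
```

The point of the sketch is that an id is meaningful only relative to the
namespace that allocated it: the same peer may be known as 1 in one namespace
and as something else, or not at all, in another.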

Eric, any thoughts about this?

Regards,
Nicolas