Re: [PATCH 0/3] perf: add support for analyzing events for containers

From: Hari Bathini
Date: Tue Nov 15 2016 - 07:21:32 EST

On Friday 11 November 2016 01:18 AM, Eric W. Biederman wrote:
Hari Bathini <hbathini@xxxxxxxxxxxxxxxxxx> writes:

Currently, there is no trivial mechanism to analyze events on a
per-container basis. perf -G can be used, but it does not filter events
for containers created after perf is invoked, which makes it difficult
to assess or analyze performance issues of multiple containers at once.

This patch-set overcomes this limitation by using a cgroup identifier as
the unique container identifier. A new PERF_RECORD_NAMESPACES event that
records namespace-related info is introduced, from which the cgroup
namespace's inode number is used as the cgroup identifier. This is based
on the assumption that each container is created with its own cgroup
namespace, allowing assessment/analysis of multiple containers using
the cgroup identifier.

The first patch introduces PERF_RECORD_NAMESPACES in the kernel, while
the second patch makes the corresponding changes in the perf tool to read
these PERF_RECORD_NAMESPACES events. The third patch adds a cgroup
identifier column in perf report, which is nothing but the cgroup
namespace's inode number. This approach is based on the suggestion from
Peter Zijlstra here: https://patchwork.kernel.org/patch/9305655/

Where is the check that ensures that only someone with
capable(CAP_SYS_ADMIN) can use this interface? This interface is not
namespace clean in multiple dimensions, so it cannot be used generally.

Right. I will add the check.

You are not allowed to move struct mnt_namespace into
include/linux/mnt_namespace.h. Al Viro will crucify you with cause.
Those are implementation details the rest of the kernel should not be
digging into.

Ouch! How about adding accessor function(s) in fs/namespace.c instead?

Where are the device numbers that go with those inode numbers you are
exporting? For now all of those inodes live on the filesystem but I am
not giving guarantees to userspace that do not work for ordinary
filesystems.

Sorry, I didn't get this. I want to use these numbers as an identity
for a namespace (like a pid for a process).
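
For context, one reading of the device-number question: an inode number
is only unique within a single filesystem, so user space normally
identifies a file by the (st_dev, st_ino) pair rather than by st_ino
alone. A minimal user-space sketch of that comparison follows. The names
ns_id, ns_id_from_path() and ns_id_equal() are hypothetical helpers made
up for illustration; they are not part of this patch set or of perf.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical identity record: device number plus inode number. */
struct ns_id {
	uint64_t dev;
	uint64_t ino;
};

/*
 * Fill *out from the file that path resolves to. stat() follows
 * symlinks, so for a /proc/<pid>/ns/<type> link this reports the
 * namespace object the link points at.
 * Returns 0 on success, -1 if stat() fails (errno is left set).
 */
int ns_id_from_path(const char *path, struct ns_id *out)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;

	out->dev = (uint64_t)st.st_dev;
	out->ino = (uint64_t)st.st_ino;
	return 0;
}

/* Two identities match only if both device and inode numbers match. */
bool ns_id_equal(const struct ns_id *a, const struct ns_id *b)
{
	return a->dev == b->dev && a->ino == b->ino;
}
```

Called with "/proc/self/ns/cgroup" and "/proc/<pid>/ns/cgroup", the two
results should compare equal when both processes share a cgroup
namespace, assuming a kernel that exposes that file.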

Thanks
Hari