Perf event operation with hotplug cpus and cgroups
From: William Cohen
Date: Fri Mar 20 2015 - 15:11:46 EST
The current perf event interface avoids complexity in the kernel by
making user space responsible for opening a file descriptor on each
cpu to monitor performance events. However, there are two use cases
where this approach has issues: system-wide measurements with hotplug
cpus, and monitoring of cgroups.
hotplug cpus
Hotplug can dynamically change the number of cpus that are active
on the system. If "perf stat -a ..." is started with some of the
processors offline and additional processors are brought online after
perf has started, no data is gathered from the newly onlined
processors.
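
To make the gap concrete, here is a minimal sketch of how one could
work out which cpus a running measurement is missing. It assumes the
online set is taken from a cpu list string in the format the kernel
uses for /sys/devices/system/cpu/online (for example "0-3,8-11"). The
function names and the two snapshot strings are made up for
illustration; this is not how perf itself works.

```python
def parse_cpu_list(text):
    """Parse a kernel cpu list such as "0-3,8-11" into a set of cpu ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty list means no cpus
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def unmonitored_cpus(online_at_start, online_now):
    """Cpus that came online after the per-cpu events were opened."""
    return sorted(parse_cpu_list(online_now) - parse_cpu_list(online_at_start))


# Hypothetical snapshots: cpus 4-7 were offline when monitoring started.
print(unmonitored_cpus("0-3", "0-7"))  # prints [4, 5, 6, 7]
```

A tool would have to poll or be notified of such changes and open
additional events for every cpu in the returned list.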
cgroup monitoring
Cgroup monitoring is built on the perf event per-cpu monitoring.
If the cgroup is not pinned to a particular set of processors, then
system-wide monitoring for that cgroup needs to be done, and a perf
event open is needed for every cpu in the system. The issue with this
approach shows up when cgroups are used for virtual machine guests and
each cgroup is allocated a single processor: the number of cgroups is
then proportional to the number of processors in the machine, so the
number of files that need to be opened to monitor the cgroups on the
system is O(cpus^2). For a large system with 80 cpus that would be 6400
files, much larger than the default ulimit settings, and a huge
number of syscalls is needed to read out the information. If one
limits the number of files opened for performance monitoring by pinning
cgroups to particular processors, any change in the pinning of cgroups
to processors will make the measurement incorrect.
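
The arithmetic above can be checked with a short sketch. It assumes
one file descriptor per (cgroup, cpu) pair for a single event, so
monitoring several events would multiply the count further. The helper
name fds_needed is hypothetical, and the limit printed is whatever
soft RLIMIT_NOFILE the process running the script happens to have.

```python
import resource


def fds_needed(num_cgroups, num_cpus):
    """One perf event fd per (cgroup, cpu) pair, for a single event."""
    return num_cgroups * num_cpus


cpus = 80
# One guest cgroup per cpu, each monitored on every cpu.
needed = fds_needed(num_cgroups=cpus, num_cpus=cpus)
soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)

print(f"fds needed for one event: {needed}")
if soft_limit != resource.RLIM_INFINITY and needed > soft_limit:
    print(f"exceeds the soft RLIMIT_NOFILE of {soft_limit}")
```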
Given the issues with these use cases, is having user space set up the
counters for each cpu in the system the best solution? Would it be
better to allow system-wide data collection to be selected with
one perf event open with pid==-1 and cpu==-1? Is setup of per-cpu
monitoring and aggregation of the counters across processors too
difficult to do in the kernel?
-Will