Re: [RFC PATCH 0/2] perf_events: add support for per-cpu per-cgroup monitoring
From: Lin Ming
Date: Wed Sep 01 2010 - 23:53:27 EST
On Tue, Aug 31, 2010 at 11:25 PM, Stephane Eranian <eranian@xxxxxxxxxx> wrote:
> This series of patches adds per-container (cgroup) filtering capability
> to per-cpu monitoring. In other words, we can monitor all threads belonging
> to a specific cgroup and running on a specific CPU.
>
> This is useful to measure what is going on inside a cgroup, something that
> cannot easily and cheaply be achieved with either per-thread or per-cpu mode.
> Cgroups can span multiple CPUs. CPUs can be shared between cgroups. Cgroups
> can have lots of threads. Threads can come and go during a measurement.
>
> To measure per-cgroup today requires using per-thread mode and attaching to
> all the current threads inside a cgroup and tracking new threads. That would
> require scanning of /proc/PID, which is subject to race conditions, and
> creating an event for each thread, each event requiring kernel memory.
>
> The approach taken by this patch is to leverage the per-cpu mode by simply
> adding a filtering capability on context switch only when necessary. That
> way the amount of kernel memory used remains bound by the number of CPUs.
> We also do not have to scan /proc. We are only interested in cgroup level
> counts, not per-thread.
>
> The cgroup to monitor is designated by passing a file descriptor opened
> on a new per-cgroup file in the cgroup filesystem (perf_event.perf). The
> option must be activated by setting perf_event_attr.cgroup=1 and passing
> a valid file descriptor in perf_event_attr.cgroup_fd. Those are the only
> two ABI extensions.
>
> The patch also includes changes to the perf tool to make use of cgroup
> filtering. Both perf stat and perf record have been extended to support
> cgroup via a new -G option. The cgroup is specified per event:
>
> $ perf stat -a -e cycles:u,cycles:u -G test1,test2 -- sleep 1
>
>  Performance counter stats for 'sleep 1':
>
>      2368881622  cycles  test1
>               0  cycles  test2
>
>     1.001938136  seconds time elapsed
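
The filtering described in the quoted cover letter can be modeled with a small Python sketch. This is a toy simulation only, not kernel code: the class `PerCpuCgroupCounter`, the function `simulate`, and the schedule tuples are hypothetical names invented for illustration. It assumes one counter per CPU that is switched on or off at each context switch depending on the incoming task's cgroup, so the number of counters follows the CPU count and not the thread count.

```python
class PerCpuCgroupCounter:
    """Toy model of one per-CPU event that is filtered by cgroup (hypothetical)."""

    def __init__(self, cpu, cgroup):
        self.cpu = cpu
        self.cgroup = cgroup
        self.count = 0
        self.active = False

    def sched_in(self, task_cgroup):
        # The filter is evaluated only at context-switch time.
        self.active = (task_cgroup == self.cgroup)

    def run(self, cycles):
        # Cycles are accumulated only while a matching task is on this CPU.
        if self.active:
            self.count += cycles


def simulate(schedule, cgroup, ncpus):
    """Replay (cpu, task_cgroup, cycles) slices and return the cgroup's total."""
    counters = [PerCpuCgroupCounter(cpu, cgroup) for cpu in range(ncpus)]
    for cpu, task_cgroup, cycles in schedule:
        counters[cpu].sched_in(task_cgroup)
        counters[cpu].run(cycles)
    return sum(c.count for c in counters)


# Made-up scheduling trace: two CPUs shared by cgroup "test1" and the root cgroup.
schedule = [
    (0, "test1", 100),
    (0, "root", 50),
    (1, "test1", 200),
    (1, "root", 25),
]

for name in ("test1", "test2"):
    print(name, simulate(schedule, name, ncpus=2))
```

In this toy model a task contributes only while it runs in the monitored cgroup, regardless of which process launched it. Whether the actual patch behaves the same way for the launched command is not something the sketch can confirm.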

I have tried this new feature. Cool!

The usage is:

    perf stat [<options>] [<command>]

Is the command ("sleep 1" in the above example) also counted?

Thanks,
Lin Ming