Re: [RFC PATCH 0/2] perf_events: add support for per-cpu per-cgroup monitoring (v3)

From: Peter Zijlstra
Date: Tue Sep 21 2010 - 05:38:49 EST


On Thu, 2010-09-09 at 15:05 +0200, Stephane Eranian wrote:
> The cgroup to monitor is designated by passing a file descriptor opened
> on a new per-cgroup file in the cgroup filesystem (perf_event.perf). The
> option must be activated by setting perf_event_attr.cgroup=1 and passing
> a valid file descriptor in perf_event_attr.cgroup_fd. Those are the only
> two ABI extensions.

> +++ b/include/linux/perf_event.h
> @@ -215,8 +215,9 @@ struct perf_event_attr {
> */
> precise_ip : 2, /* skid constraint */
> mmap_data : 1, /* non-exec mmap data */
> + cgroup : 1, /* cgroup aggregation */
>
> - __reserved_1 : 46;
> + __reserved_1 : 45;
>
> union {
> __u32 wakeup_events; /* wakeup every n events */
> @@ -226,6 +227,8 @@ struct perf_event_attr {
> __u32 bp_type;
> __u64 bp_addr;
> __u64 bp_len;
> +
> + int cgroup_fd;
> };
>
> /*

I'm not sure I like this much. We attach to {pid,cpu}. For nodes we can
use cpu_to_node(cpu), which would suggest using cgroup_of_task(pid) here,
except that a task can be part of multiple cgroups, so it's not unique.

One thing we could do is pass this cgroup identifier in the pid field
and use PERF_FLAG_CGROUP or something. Currently the syscall signature
uses pid_t, but I think we can safely change that to int.
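
A rough userspace sketch of what the pid-field variant could look like,
not a definitive interface. The flag is spelled PERF_FLAG_PID_CGROUP
here because that is the name current <linux/perf_event.h> headers
carry; the suggestion above only says "PERF_FLAG_CGROUP or something".
open_cgroup_event() and the cgroup path are made up for illustration,
and passing the cgroup directory fd is an assumption.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: count cycles for all tasks of one cgroup on one CPU.
 * Returns a perf event fd, or -1 on failure.
 */
int open_cgroup_event(const char *cgroup_dir, int cpu)
{
	struct perf_event_attr attr;
	int cgrp_fd, ev_fd;

	/* Assumption: an fd on the cgroup directory identifies the cgroup. */
	cgrp_fd = open(cgroup_dir, O_RDONLY);
	if (cgrp_fd < 0)
		return -1;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;

	/* The cgroup fd travels in the pid argument; the flag says so. */
	ev_fd = (int)syscall(SYS_perf_event_open, &attr, cgrp_fd, cpu, -1,
			     PERF_FLAG_PID_CGROUP);

	/* Assumption: the event holds its own reference on the cgroup. */
	close(cgrp_fd);
	return ev_fd;
}
```

A caller would do something like open_cgroup_event("/cgroup/mygroup", 0)
and read() the count from the returned fd. Nothing new is needed in
perf_event_attr this way; the only ABI change is the flag.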

You create a special new file in the cgroup filesystem. I'm not sure about
that either, but it's not something I feel too strongly about. Why
wouldn't an fd of any file, or even the directory, of that cgroup work? Do
the cgroup people have an opinion?