Re: [PATCH v3 1/2] perf/core: Share an event with multiple cgroups

From: Namhyung Kim
Date: Sun May 09 2021 - 03:14:10 EST


Hi Peter,

Thinking about the interface a bit more...

On Fri, Apr 16, 2021 at 4:59 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Fri, Apr 16, 2021 at 08:22:38PM +0900, Namhyung Kim wrote:
> > On Fri, Apr 16, 2021 at 7:28 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Fri, Apr 16, 2021 at 11:29:30AM +0200, Peter Zijlstra wrote:
> > >
> > > > > So I think we've had proposals for being able to close fds in the past;
> > > > > while preserving groups etc. We've always pushed back on that because of
> > > > > the resource limit issue. By having each counter be a filedesc we get a
> > > > > natural limit on the amount of resources you can consume. And in that
> > > > > respect, having to use 400k fds is things working as designed.
> > > > >
> > > > > Anyway, there might be a way around this..
> > >
> > > So how about we flip the whole thing sideways, instead of doing one
> > > event for multiple cgroups, do an event for multiple-cpus.
> > >
> > > Basically, allow:
> > >
> > > perf_event_open(.pid=fd, cpu=-1, .flag=PID_CGROUP);
> > >
> > > Which would have the kernel create nr_cpus events [the corollary is that
> > > we'd probably also allow: (.pid=-1, cpu=-1) ].
> >
> > Do you mean it'd have separate perf_events per cpu internally?
> > From a cpu's perspective, there's nothing changed, right?
> > Then it will have the same performance problem as of now.
>
> Yes, but we'll not end up in ioctl() hell. The interface is sooo much
> better. The performance thing just means we need to think harder.

So I'd like to have vector support, for cgroups at first, though it
could be extended to other types later. The event would be opened
with a flag saying that it accepts a vector:

fd = perf_event_open(.pid=-1, .cpu=N, .flag=VECTOR);

Then it'd still need an additional interface (probably an ioctl) to
set (or append to) the vector:

ioctl(fd, ADD_VECTOR, { .type = VEC_CGROUP, .nr = N, ... });
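
To make the argument a bit more concrete, here is a rough userspace
sketch of how it could be laid out and filled in. Everything in it is
hypothetical: struct perf_vector_arg, VEC_CGROUP's value and
vector_arg_new() do not exist in the perf UAPI, and I'm assuming each
entry carries a cgroup directory fd, the same thing that is passed as
.pid with PERF_FLAG_PID_CGROUP today.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical: none of these names exist in the perf UAPI. */
enum { VEC_CGROUP = 1 };

struct perf_vector_arg {
	uint32_t type;      /* kind of entries, e.g. VEC_CGROUP */
	uint32_t nr;        /* number of entries that follow */
	uint64_t entries[]; /* assumed: one cgroup directory fd per entry */
};

/*
 * Build the argument that would be handed to the proposed
 * ioctl(fd, ADD_VECTOR, arg). Returns NULL on allocation failure;
 * the caller frees the result with free().
 */
static struct perf_vector_arg *
vector_arg_new(uint32_t type, const int *fds, uint32_t nr)
{
	struct perf_vector_arg *arg;
	uint32_t i;

	assert(fds != NULL || nr == 0);

	arg = malloc(sizeof(*arg) + (size_t)nr * sizeof(arg->entries[0]));
	if (!arg)
		return NULL;

	arg->type = type;
	arg->nr = nr;
	for (i = 0; i < nr; i++)
		arg->entries[i] = (uint64_t)fds[i];

	return arg;
}
```

A header plus a flexible array keeps it to a single call no matter how
many cgroups are attached, which is the point of avoiding one fd and
one ioctl per cgroup.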

We may also need to add a FORMAT_VECTOR read format and use read(v)
or friends to read the contents of each entry. It'd be nice if each
entry could carry vector-specific info, like the cgroup id in this case.
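
As a sketch only, the read side could look something like the
following. The layout is an assumption of mine, loosely modeled on
what PERF_FORMAT_GROUP | PERF_FORMAT_ID returns today (a count followed
by value/id pairs); struct vec_entry and parse_vector_read() are made-up
names, and "id" stands for the vector-specific info (the cgroup id).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-entry record for a FORMAT_VECTOR read. */
struct vec_entry {
	uint64_t value; /* counter value for this entry */
	uint64_t id;    /* vector-specific info, e.g. cgroup id */
};

/*
 * Parse a buffer assumed to be laid out as:
 *   { u64 nr; struct vec_entry entries[nr]; }
 * Returns the number of entries copied into 'out', or -1 if the
 * buffer is truncated or 'out' has fewer than nr slots.
 */
static long parse_vector_read(const void *buf, size_t len,
			      struct vec_entry *out, size_t max)
{
	uint64_t nr;

	assert(buf != NULL && out != NULL);

	if (len < sizeof(nr))
		return -1;
	memcpy(&nr, buf, sizeof(nr));

	/* Divide rather than multiply so a bogus nr cannot overflow. */
	if (nr > max || (len - sizeof(nr)) / sizeof(*out) < nr)
		return -1;

	memcpy(out, (const char *)buf + sizeof(nr), nr * sizeof(*out));
	return (long)nr;
}
```

The caller would read() from the vector fd into a buffer sized for the
number of entries it added and then pass that buffer to the parser.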

What do you think?

Thanks,
Namhyung