Re: [PATCH v7 07/12] perf/x86: no counter allocation support

From: Peter Zijlstra
Date: Tue Jul 09 2019 - 05:43:26 EST


On Tue, Jul 09, 2019 at 10:58:46AM +0800, Wei Wang wrote:
> On 07/08/2019 10:29 PM, Peter Zijlstra wrote:
>
> Thanks for the comments.
>
> >
> > > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > > index 0ab99c7..19e6593 100644
> > > --- a/include/linux/perf_event.h
> > > +++ b/include/linux/perf_event.h
> > > @@ -528,6 +528,7 @@ typedef void (*perf_overflow_handler_t)(struct perf_event *,
> > > */
> > > #define PERF_EV_CAP_SOFTWARE BIT(0)
> > > #define PERF_EV_CAP_READ_ACTIVE_PKG BIT(1)
> > > +#define PERF_EV_CAP_NO_COUNTER BIT(2)
> > > #define SWEVENT_HLIST_BITS 8
> > > #define SWEVENT_HLIST_SIZE (1 << SWEVENT_HLIST_BITS)
> > > @@ -895,6 +896,13 @@ extern int perf_event_refresh(struct perf_event *event, int refresh);
> > > extern void perf_event_update_userpage(struct perf_event *event);
> > > extern int perf_event_release_kernel(struct perf_event *event);
> > > extern struct perf_event *
> > > +perf_event_create(struct perf_event_attr *attr,
> > > + int cpu,
> > > + struct task_struct *task,
> > > + perf_overflow_handler_t overflow_handler,
> > > + void *context,
> > > + bool counter_assignment);
> > > +extern struct perf_event *
> > > perf_event_create_kernel_counter(struct perf_event_attr *attr,
> > > int cpu,
> > > struct task_struct *task,
> > Why the heck are you creating this wrapper nonsense?
>
> (please see early discussions: https://lkml.org/lkml/2018/9/20/868)
> I thought we agreed that the perf event created here doesn't need to consume
> an extra counter.

That's almost a year ago; I really can't remember that and you didn't
put any of that in your Changelog to help me remember.

(also please use: https://lkml.kernel.org/r/$msgid style links)

> In the previous version, we added a "no_counter" bit to perf_event_attr, but
> that would be exposed to the user ABI, which doesn't seem good.
> (https://lkml.org/lkml/2019/2/14/791)
> So we wrap a new kernel API above to support this.
>
> Do you have a different suggestion to do this?
> (exclude host/guest just clears the enable bit on VM-exit/entry and
> still consumes the counter)

Just add an argument to perf_event_create_kernel_counter() ?
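
A rough userspace sketch of what that could look like, only to illustrate the
shape of the change. The struct layouts, the simplified handler typedef, and
the "need_counter" parameter name are all stand-ins invented for this example,
not the actual kernel definitions; only PERF_EV_CAP_NO_COUNTER and the function
name come from the patch quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define BIT(n) (1UL << (n))
#define PERF_EV_CAP_NO_COUNTER BIT(2)	/* as proposed in the quoted patch */

/* Simplified stand-ins for the kernel types (assumptions, not real layouts). */
struct perf_event_attr {
	unsigned int type;
	unsigned long long config;
};
struct task_struct;
struct perf_event;
typedef void (*perf_overflow_handler_t)(struct perf_event *event);

struct perf_event {
	struct perf_event_attr attr;
	int cpu;
	unsigned long event_caps;
	perf_overflow_handler_t overflow_handler;
	void *overflow_handler_context;
};

/*
 * Existing entry point with one extra argument instead of a new wrapper.
 * "need_counter" is a hypothetical name; false marks the event so that
 * counter scheduling could skip it.
 */
struct perf_event *
perf_event_create_kernel_counter(struct perf_event_attr *attr,
				 int cpu,
				 struct task_struct *task,
				 perf_overflow_handler_t overflow_handler,
				 void *context,
				 bool need_counter)
{
	struct perf_event *event;

	(void)task;		/* unused in this mock */
	assert(attr != NULL);

	event = calloc(1, sizeof(*event));
	if (!event)
		return NULL;

	event->attr = *attr;
	event->cpu = cpu;
	event->overflow_handler = overflow_handler;
	event->overflow_handler_context = context;

	/* The capability stays kernel-internal; nothing is added to the ABI. */
	if (!need_counter)
		event->event_caps |= PERF_EV_CAP_NO_COUNTER;

	return event;
}
```

Existing callers would pass true and keep their current behaviour; only the
caller that does not want a counter would pass false.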