Re: [PATCH v6] perf: Sharing PMU counters across compatible events

From: Song Liu
Date: Tue Nov 05 2019 - 18:07:21 EST

> On Nov 5, 2019, at 12:16 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Tue, Nov 05, 2019 at 05:11:08PM +0000, Song Liu wrote:
>
>>> I think we can use one of the event as master. We need to be careful when
>>> the master event is removed, but it should be doable. Let me try.
>>
>> Actually, there is a bigger issue when we use one event as the master: what
>> shall we do if the master event is not running? Say it is a cgroup event,
>> and the cgroup is not running on this cpu. An extra master (and all these
>> array hacks) helps us get O(1) complexity in such a scenario.
>>
>> Do you have suggestions on how to solve this problem? Maybe we can keep the
>> extra master, and try to get rid of the double alloc?
>
> Right, you have to consider scope when sharing. The master should be the
> largest scope event and any slaves should be complete subsets.
>
> Without much thought this seems a fairly straightforward constraint;
> that is, given cgroups I'm not immediately seeing how we can violate
> that.
>
> Basically, pick the cgroup event nearest to the root as the master.
> We have to have logic to re-elect the master anyway for deletion, so
> changing it on add shouldn't be different.
>
> (obviously the root-cgroup is cpu-wide and always on, and if you have
> two events from disjoint subtrees they have no overlap, so it doesn't
> make sense to share anyway)

Hmm... I didn't think about the cgroup structure in this much detail. And
this is a very interesting idea.
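
A minimal userspace sketch of that election rule, to make the constraint
concrete. Everything here is hypothetical and for illustration only:
struct cg_node, struct shared_event, cg_covers() and elect_master() are
made-up names, not the kernel's struct cgroup or struct perf_event, and
real code would also need locking and the compatibility checks from the
patch. The sketch assumes each cgroup only knows its parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup: only the parent link matters here. */
struct cg_node {
	struct cg_node *parent;		/* NULL for the root cgroup */
};

/* Hypothetical stand-in for an event that wants to share a counter. */
struct shared_event {
	struct cg_node *cgrp;		/* cgroup this event counts for */
};

/* Return true if @a is @b itself or an ancestor of @b. */
static bool cg_covers(const struct cg_node *a, const struct cg_node *b)
{
	for (; b; b = b->parent)
		if (a == b)
			return true;
	return false;
}

/*
 * Elect the event whose cgroup is nearest to the root, i.e. whose scope
 * covers every other event in the set. Return NULL when no single event
 * covers all the others (for example, events from disjoint subtrees),
 * in which case the set should not share one counter.
 */
static struct shared_event *elect_master(struct shared_event **events,
					 size_t n)
{
	struct shared_event *best = NULL;
	size_t i;

	assert(events || n == 0);

	/* Pass 1: keep moving up whenever an event covers the candidate. */
	for (i = 0; i < n; i++)
		if (!best || cg_covers(events[i]->cgrp, best->cgrp))
			best = events[i];

	/* Pass 2: the candidate must cover every event, or sharing is off. */
	for (i = 0; i < n; i++)
		if (!cg_covers(best->cgrp, events[i]->cgrp))
			return NULL;

	return best;
}
```

The same function would serve both add and delete: rerun it over the
remaining events whenever the set changes. A NULL result corresponds to
the disjoint-subtree case above, where sharing does not make sense.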

OTOH, a non-cgroup event could also be inactive. For example, when we have
to rotate events, we may schedule a slave before its master. And if the
master is in an event group, it will be more complicated...

Currently, we already have two separate scopes in sharing: one for cpu_ctx,
the other for task_ctx. I would like to enable as much sharing as possible
within each ctx.

Let me double-check whether we can make the code with the extra master
clearer, namely, get rid of the double alloc and the ugly array.

Thanks,
Song