Re: [PATCH v2 4/9] perf affinity: Add infrastructure to save/restore affinity

From: Andi Kleen
Date: Wed Oct 23 2019 - 18:37:08 EST


On Wed, Oct 23, 2019 at 09:08:47PM +0300, Alexey Budankov wrote:
> On 23.10.2019 20:19, Andi Kleen wrote:
> > On Wed, Oct 23, 2019 at 07:16:13PM +0300, Alexey Budankov wrote:
> >>
> >> On 23.10.2019 17:52, Andi Kleen wrote:
> >>> On Wed, Oct 23, 2019 at 04:30:49PM +0200, Jiri Olsa wrote:
> >>>> On Wed, Oct 23, 2019 at 06:02:35AM -0700, Andi Kleen wrote:
> >>>>> On Wed, Oct 23, 2019 at 11:59:11AM +0200, Jiri Olsa wrote:
> >>>>>> On Sun, Oct 20, 2019 at 10:51:57AM -0700, Andi Kleen wrote:
> >>>>>>
> >>>>>> SNIP
> >>>>>>
> >>>>>>> +}
> >>>>>>> diff --git a/tools/perf/util/affinity.h b/tools/perf/util/affinity.h
> >>>>>>> new file mode 100644
> >>>>>>> index 000000000000..e56148607e33
> >>>>>>> --- /dev/null
> >>>>>>> +++ b/tools/perf/util/affinity.h
> >>>>>>> @@ -0,0 +1,15 @@
> >>>>>>> +// SPDX-License-Identifier: GPL-2.0
> >>>>>>> +#ifndef AFFINITY_H
> >>>>>>> +#define AFFINITY_H 1
> >>>>>>> +
> >>>>>>> +struct affinity {
> >>>>>>> + unsigned char *orig_cpus;
> >>>>>>> + unsigned char *sched_cpus;
> >>>>>>
> >>>>>> why not use cpu_set_t directly?
> >>>>>
> >>>>> Because it's too small in glibc (only 1024 CPUs) and perf already
> >>>>> supports more.
> >>>>
> >>>> nice, we're using it all over the place.. how about using bitmap_alloc?
> >>>
> >>> Okay.
> >>>
> >>> The other places are mainly in perf record, from Alexey's recent affinity changes.
> >>> These probably need to be fixed.
> >>>
> >>> +Alexey
> >>
> >> Although the issue does look generic for both stat and record modes,
> >> have you already observed record startup overhead anywhere in your setups?
> >> I would prefer to reproduce the overhead first, to have a stable use case
> >> for evaluation and then, possibly, improvement.
> >
> > What I meant is that the cpu_set usages you added in
> >
> > commit 9d2ed64587c045304efe8872b0258c30803d370c
> > Author: Alexey Budankov <alexey.budankov@xxxxxxxxxxxxxxx>
> > Date: Tue Jan 22 20:47:43 2019 +0300
> >
> > perf record: Allocate affinity masks
> >
> > need to be fixed to allocate dynamically, or at least use MAX_NR_CPUS to
> > support systems with >1024 CPUs. That's an independent functionality
> > problem.
>
> Oh, it is clear now. Thanks for pointing this out. To move from
> cpu_set_t to the new custom struct affinity type, its API would need to be
> extended with mask operations similar to the ones that cpu_set_t provides:
> CPU_ZERO(), CPU_SET(), CPU_EQUAL(), CPU_OR().
>
> For example, it could be: affinity__mask_zero(), affinity__mask_set(),
> affinity__mask_equal(), affinity__mask_or(). Then the collecting part
> of record could also be moved to the struct affinity type and overcome
> the >1024 CPUs limitation.

Not sure you need to use my library, which is somewhat specialized,
except perhaps for the get_cpu_set_size() function.

For everything else you can use the normal Linux bitmap functions,
or call the syscall directly.
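
As a rough sketch of the direct approach (illustration only, not code
from this patch series; affinity_get_dyn() is a made-up name): glibc's
CPU_ALLOC() and the CPU_*_S() macros work on masks larger than the fixed
1024-bit cpu_set_t, and sched_getaffinity() fails with EINVAL when the
buffer is smaller than the kernel's mask, so the size can be probed by
growing the buffer. The 1 << 20 upper bound is an arbitrary assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): return the calling thread's
 * affinity in a dynamically sized mask and store its size in bytes in
 * *size_out. The caller releases the mask with CPU_FREE().
 */
static cpu_set_t *affinity_get_dyn(size_t *size_out)
{
	int ncpus;

	assert(size_out != NULL);

	/* Start at glibc's fixed limit and double until the kernel accepts it. */
	for (ncpus = 1024; ncpus <= (1 << 20); ncpus *= 2) {
		size_t size = CPU_ALLOC_SIZE(ncpus);
		cpu_set_t *set = CPU_ALLOC(ncpus);
		int err;

		if (set == NULL)
			return NULL;
		CPU_ZERO_S(size, set);
		if (sched_getaffinity(0, size, set) == 0) {
			*size_out = size;
			return set;
		}
		err = errno;	/* save before freeing the buffer */
		CPU_FREE(set);
		/* EINVAL means the buffer was too small; anything else is fatal. */
		if (err != EINVAL)
			return NULL;
	}
	return NULL;
}
```

Restoring is then sched_setaffinity(0, size, set) with the saved mask,
and CPU_SET_S(), CPU_OR_S() and CPU_EQUAL_S() cover the mask operations
on the same dynamically sized buffer.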

-Andi