Re: [perf stat] Extend --cpu to non-system-wide runs too? was Re: [PATCH v3] perf bench sched pipe: Add -G/--cgroups option

From: Ingo Molnar
Date: Wed Oct 18 2023 - 08:07:55 EST



* Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:

> Em Tue, Oct 17, 2023 at 02:43:45PM +0200, Ingo Molnar escreveu:
> > * Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
> > > Em Tue, Oct 17, 2023 at 01:40:07PM +0200, Ingo Molnar escreveu:
> > > > Side note: it might make sense to add a sane cpumask/affinity setting
> > > > option to perf stat itself:
>
> > > > perf stat --cpumask
>
> > > > ... or so?
>
> > > > We do have -C:
>
> > > > -C, --cpu <cpu> list of cpus to monitor in system-wide
>
> > > > ... but that's limited to --all-cpus, right?
>
> > > > Perhaps we could extend --cpu to non-system-wide runs too?
>
> > > Maybe I misunderstood your question, but its a list of cpus to limit the
> > > counting:
>
> > Ok.
>
> > So I thought that "--cpumask mask/list/etc" should simply do what 'taskset'
> > is doing: using the sched_setaffinity() syscall to make the current
> > workload and all its children.
>
> > There's impact on perf stat itself: it could just call sched_setaffinity()
> > early on, and not bother about it?
>
> > Having it built-in into perf would simply make it easier to not forget
> > running 'taskset'. :-)
>
> Would that be the only advantage?

1)

Another advantage would be that perf stat could itself bind itself to the
inverse affinity mask.

This means the workload that is being executed is disturbed by perf as
little as possible.

That's not possible with 'taskset'.

2)

Plus taskset's syntax is arguably silly: why does it need a separate -c
option for a CPU list, why doesn't it figure it out by itself when there's
a comma or dash in the mask string?

A better syntax that perf could use would be to interpret it as a CPU mask
only when presented with a '0x' or '0b' prefix for a binary mask which is
IMO much more logical if we are talking masks. For example, to run on 8
full cores, using the '0b' GCC extension to specify binary literals:

perf stat --cpus 0b101010101010101

'taskset' has other syntax idiosyncracies, such as the weird inverted
argument order of PID and CPU list:

kepler:~/tip> taskset -p $$ 0b101010101010101
taskset: invalid PID argument: '0b101010101010101'

# .... erm, what??
# .... oh, taskset expects the PID argument last:

kepler:~/tip> taskset -p 0b101010101010101 $$
pid 195878's current affinity mask: ffffffffffffffff
pid 195878's new affinity mask: b101010101010101

# ... damn: taskset doesn't know the 0b prefix and blindly assumed it's
# hexadecimal ...

So I'd love it if perf stat grew a sane CPU-affinity option.

3)

As a bonus perf stat could later also grow matching memory-node-affinity
features that 'taskset' doesn't have ...

Anyway, that's my attempt to convince you guys that it's a good idea to
have this feature. :-)

Thanks,

Ingo