Re: [RFCv2 4/4] perf: util: support sysfs supported_cpumask file
From: Mark Rutland
Date: Thu Jul 21 2016 - 05:49:51 EST
On Thu, Jul 21, 2016 at 10:10:35AM +0200, Jiri Olsa wrote:
> On Mon, Jul 18, 2016 at 04:00:45PM +0100, Mark Rutland wrote:
> > On Mon, Jul 18, 2016 at 04:30:18PM +0200, Jiri Olsa wrote:
> > > On Fri, Jul 15, 2016 at 11:08:13AM +0100, Mark Rutland wrote:
> > > > For system PMUs, the perf tools have long expected a cpumask file under
> > > > sysfs, describing the single CPU which they support events being
> > >
> > > single cpu? it's cpumask..
> >
> > Indeed.
> >
> > The issue is that in practice, due to an internal inconsistency the
> > perf tools only work work when a single CPU is described in the mask.
> > More details below (and in patch 1).
> >
> > > > opened/handled on. Prior patches in this series have reworked this
> > > > support to support multiple CPUs in a mask, as is required to handle
> > > > heterogeneous CPU PMUs.
> > > >
> > > > Unfortunately, adding a cpumask file to CPU PMUs would break existing
> > > > userspace. Prior to this series, perf record will refuse to open events,
> > >
> > > I'm lost.. we already have 'cpumask' file under pmu..
> >
> > Sorry, I should spell out the problem more concretely:
> >
> > When manipulating events, the tools sometimes use evsel->cpus, and other
> > times evlist->cpus. Sometimes, the two are used inconsistently, which
> > only works if they are the same size and/or describe the same CPUs.
> > Patch 1 fixes an instance of this, where the inconsistency results in
> > treating uninitialised memory as perf event FDs.
> >
> > In the absence of a PMU cpumask file, the evsel's cpumask is initialised
> > to that of the evlist, so things line up.
> >
> > Currently the only PMUs which happen to expose a cpumask are uncore
> > PMUs, which in practice only describe a single CPU.
> >
> > When recording system-wide, various parts of the perf tools assume a
> > single CPU, regardless of evlist->cpus, for the purpose of manipulating
> > events. This happens to make uncore PMUs work, avoiding the
> > inconsistency.
> >
> > Were we to just add a 'cpumask' file to our CPU PMUs, we would break
> > existing userspace (e.g. hitting the issue fixed in patch 1).
>
> so you're saying that perf is broken once pmu's cpumask
> contains more than single cpu, is that right?
Yes.
> we should fix that, not make workarounds.. I'll go check,
> I might be still missing something ;-)
I certainly agree that this should be fixed in the perf tool; hence
patches 1-3. ;)
The problem the workaround is trying to solve is kernel compatibility
with existing binaries, for which (prior to this series):
- perf record doesn't work by default in heterogeneous systems in the
*absence* of a cpumask.
- perf stat doesn't work by default in heterogeneous systems in the
*presence* of a cpumask.
The kernel doesn't *currently* expose a cpumask for the ARM CPU PMUs, so
we'd need to add one. While new userspace should work as of these
patches, I can't add a file called 'cpumask' kernel-side without
breaking existing perf existing binaries (in the case of perf stat).
If it's possible to solve this without exposing a cpumask file at all,
that would be ideal, but so far I haven't been able to make that work.
Any ideas welcome!
> would be great to have some automated test for this stuff
Good point. I will take a look into that.
Thanks,
Mark.
[1] http://lkml.kernel.org/r/1468577293-19667-1-git-send-email-mark.rutland@xxxxxxx