Re: [PATCH] perf: fix topology test on systems with sparse CPUs

From: Jiri Olsa
Date: Thu Feb 02 2017 - 08:01:23 EST


On Thu, Feb 02, 2017 at 07:06:43AM -0500, Jan Stancek wrote:
> >
> > > When build_cpu_topo() encounters offline/absent CPUs,
> > > it fails to find any sysfs entries and returns failure.
> > > This leads to build_cpu_topology() and write_cpu_topology()
> > > failing as well.
> > >
> > > Because HEADER_CPU_TOPOLOGY has not been written, read leaves
> > > cpu_topology_map NULL and we get NULL ptr deref at:
> > >
> > > ...
> > > cmd_test
> > > __cmd_test
> > > test_and_print
> > > run_test
> > > test_session_topology
> > > check_cpu_topology
> >
> > So IIUIC that's the key issue here: write_cpu_topology fails
> > to write the TOPO data and subsequent readers crash on processing
> > incomplete data? If that's the case, write_cpu_topology needs to
> > be fixed, instead of adding workarounds
>
> It's already too late when you are in write_cpu_topology(), because
> build_cpu_topology() returned NULL - there's nothing to write.
> That's why the patch aims to fix this in build_cpu_topology().

ok, then we need to make sure we can't fail in write_cpu_topology

that might be in scope for another patch though.. we can go with your fix for now
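
To make the idea concrete, here is a rough sketch of skipping CPUs that are
not online while building the topology, so that a missing sysfs entry for an
offline/absent CPU is no longer treated as a failure. This is not the code
from the patch: struct cpu_info, build_topo_sketch() and the demo data are
hypothetical names used only for illustration, and the real sysfs reads are
replaced by a pre-filled table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for what would be read from sysfs per CPU. */
struct cpu_info {
	bool online;	/* false for offline/absent CPUs */
	int core_id;	/* -1 means no topology entry could be read */
};

/*
 * Collect core ids for online CPUs only. Returns the number of entries
 * written to core_ids_out, or -1 if an online CPU has no topology data.
 * core_ids_out must have room for nr entries.
 */
static int build_topo_sketch(const struct cpu_info *cpus, int nr,
			     int *core_ids_out)
{
	int written = 0;
	int i;

	assert(cpus != NULL && core_ids_out != NULL);

	for (i = 0; i < nr; i++) {
		if (!cpus[i].online)
			continue;	/* skip instead of failing */
		if (cpus[i].core_id < 0)
			return -1;	/* genuine error on an online CPU */
		core_ids_out[written++] = cpus[i].core_id;
	}
	return written;
}

/* Illustrative data: CPUs 1 and 2 are offline and have no entries. */
static const struct cpu_info demo_cpus[4] = {
	{ true, 0 }, { false, -1 }, { false, -1 }, { true, 1 },
};
static int demo_out[4];
```

With the demo table, build_topo_sketch(demo_cpus, 4, demo_out) yields two
entries rather than an error, which is the behaviour the fix is after.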

SNIP

>
> > For example:
> > _SC_NPROCESSORS_CONF == 16
> > available: 2 nodes (0-1)
> > node 0 cpus: 0 6 8 10 16 22 24 26
> > node 0 size: 12004 MB
> > node 0 free: 9470 MB
> > node 1 cpus: 1 7 9 11 23 25 27
> > node 1 size: 12093 MB
> > node 1 free: 9406 MB
> > node distances:
> > node 0 1
> > 0: 10 20
> > 1: 20 10
> > so what's max_present_cpu in this example?
>
> It's 28, which is the number of core_id/socket_id entries,
> for CPUs 0 up to 27.

ok, good
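
For reference, a minimal sketch of how such a value can be derived from a
sparse CPU list: take the highest CPU id and add one, so ids 0..27 give 28
even though only 15 CPUs are listed in the example. The helper name
max_present_from_list() and the range-list input format (the style used by
files like /sys/devices/system/cpu/present) are assumptions for illustration,
not the code in the patch.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a CPU list such as "0-1,6-11,16" and return
 * the highest CPU id plus one, or -1 on empty or malformed input.
 */
static int max_present_from_list(const char *list)
{
	const char *p = list;
	int max = -1;

	assert(list != NULL);

	while (*p && *p != '\n') {
		char *end;
		long lo, hi;

		lo = strtol(p, &end, 10);
		if (end == p || lo < 0)
			return -1;
		hi = lo;
		p = end;

		if (*p == '-') {		/* range "lo-hi" */
			p++;
			hi = strtol(p, &end, 10);
			if (end == p || hi < lo)
				return -1;
			p = end;
		}

		if (hi > max)
			max = (int)hi;

		if (*p == ',')
			p++;
		else if (*p && *p != '\n')
			return -1;	/* unexpected character */
	}

	return max < 0 ? -1 : max + 1;
}
```

Writing the node CPU lists from the example as one range list,
"0-1,6-11,16,22-27", the helper returns 28, matching the number of
core_id/socket_id entries mentioned above.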

jirka