Re: [patch 07/11] x86/perf/uncore: Track packages not per cpu data
From: Stephane Eranian
Date: Thu Feb 18 2016 - 04:36:08 EST
On Thu, Feb 18, 2016 at 12:13 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Wed, Feb 17, 2016 at 11:16:40PM +0100, Andi Kleen wrote:
> > > Do you have any data to back that up or is that just "believe" ?
> >
> > I've seen systems with discontiguous apic ids before.
> >
> > It is obvious if you consider setups with node hotplug.
>
> Those systems will also have a num_possible_cpus() that's inflated to
> account for the possible hotplug of new sockets, right?
>
> So while the phys_pkg_id of the present sockets might be 2 and 3
> (leaving 0 and 1 available for hotplug), the num_possible_cpus() should
> be big enough to actually allow for that hotplug to happen.
>
> Because if num_possible_cpus() is too small, we could not accommodate
> the hotplug operation.
>
> And if num_possible_cpus() is of the right size, then the computed
> max_packages() should be of the right size too.
>
> Now clearly, BIOS can completely wreck things and indeed report too
> small an apic_id range or whatever, and in this case we're up a creek
> without a paddle.
>
> But I think you can check for that at boot and report errors/warns
> whatever, because if you trigger this, your system is not really
> 'correct' anyway.
>
The example I was worried about is as follows, take an Ivytown system
(IVT) with 12 physical cores per package on a 2 socket system.
Thomas's topology_max_packages() macro would return 2, for 2 packages
with CPU0..CPU47.
But let's assume that the BIOS does some weird mappings and that the
id for socket0 is indeed 0
but for socket1 it is 255. Then doing:
pkg = topology_physical_package_id();
pmu->boxes[pkg];
Would crash because the boxes[] has space for 2 elements and not 256.
If we know this cannot happen, then the code is fine. If we are not
sure, then I believe a check should be added
and if a mismatch is found, then the uncore PMU init code should
return an error. That's all I was trying to point
out. I think Thomas' code is indeed a good simplification.