Re: [RFC PATCH] perf/x86/intel/rapl: avoid access unallocate memory
From: Thomas Gleixner
Date: Tue Nov 08 2016 - 11:24:53 EST
On Tue, 8 Nov 2016, Charles (Chas) Williams wrote:
> On 11/08/2016 09:31 AM, Thomas Gleixner wrote:
> > On Tue, 8 Nov 2016, Charles (Chas) Williams wrote:
> > > [ 0.016335] topology_update_package_map: apicid 0 pkg 0 cpu 0
> > > [ 0.016398] smpboot: APIC(0) Converting physical 0 to logical package 0, cpu 0 (ffff88023fc0a040)
> > > [ 0.016399] topology_update_package_map: apicid 1 pkg 1 cpu 1
> > > [ 0.016462] smpboot: APIC(1) Converting physical 1 to logical package 1, cpu 1 (ffff88023fd0a040)
> > >
> > > So, I don't know where apic->cpu_present_to_apicid(cpu) is getting its
> > > apicid but it certainly doesn't seem to match the apicid in the
> > > CPU's registers. For whatever reason, my VMware system is reporting
> > > that the second CPU has a local APIC ID of 2:
> >
> > The initial information comes from MP tables or ACPI.
> >
> > > [ 0.009115] identify_cpu: cpu_index 0 phys_proc_id is now 0, apicid 0, initial_apicid 0
> > > ...
> > > [ 0.237401] identify_cpu: cpu_index 1 phys_proc_id is now 2, apicid 2, initial_apicid 2
> >
> > And the CPUID emulation tells something different. Sigh!
> >
> > > I was thinking it might be better to call topology_update_package_map()
> > > at the bottom of identify_cpu() to set up the secondary CPUs. The boot
> > > CPU could be set up during smp_init_package_map().
> >
> > Perhaps, but that does not make the inconsistencies go away....
>
> By the time I know it's not consistent, there isn't anything I can do
> about it. I can't update the table to remove the bad information.
>
> The other alternative is to trust ACPI and just update the
> cpu_data's apicid in identify_cpu() to the value from the table.
> The earlier kernels didn't seem to rely as much on this information.
> But it does appear to be "wrong" in the APIC table. From acpidump:
>
> [02Ch 0044 1] Subtable Type : 00 [Processor Local APIC]
> [02Dh 0045 1] Length : 08
> [02Eh 0046 1] Processor ID : 00
> [02Fh 0047 1] Local Apic ID : 00
> [030h 0048 4] Flags (decoded below) : 00000001
> Processor Enabled : 1
>
> [034h 0052 1] Subtable Type : 00 [Processor Local APIC]
> [035h 0053 1] Length : 08
> [036h 0054 1] Processor ID : 01
> [037h 0055 1] Local Apic ID : 01
> [038h 0056 4] Flags (decoded below) : 00000001
> Processor Enabled : 1
Well, which one is wrong is hard to tell :)
I'll have a look at how to sort that out.
Thanks,
tglx
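
A minimal user-space sketch of the mismatch discussed in this thread. It is not kernel code and not part of any posted patch. The struct apicid_view, the function count_apicid_mismatches() and the array thread_values are hypothetical names introduced only for illustration. The numbers are copied from the identify_cpu log lines and the acpidump output quoted above: the ACPI table lists Local APIC IDs 0 and 1, while identify_cpu reports apicid 0 and 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record: one CPU's APIC ID as seen from two sources. */
struct apicid_view {
    unsigned int cpu;          /* logical CPU index */
    unsigned int madt_apicid;  /* "Local Apic ID" from the ACPI subtable */
    unsigned int cpuid_apicid; /* apicid printed by identify_cpu */
};

/* Count the entries where the two sources disagree and print each one. */
static size_t count_apicid_mismatches(const struct apicid_view *v, size_t n)
{
    size_t i, bad = 0;

    assert(v != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (v[i].madt_apicid != v[i].cpuid_apicid) {
            printf("cpu %u: table says apicid %u, CPUID says apicid %u\n",
                   v[i].cpu, v[i].madt_apicid, v[i].cpuid_apicid);
            bad++;
        }
    }
    return bad;
}

/* Values taken from the logs and acpidump quoted in this thread. */
static const struct apicid_view thread_values[] = {
    { 0, 0, 0 },  /* boot CPU: both sources agree */
    { 1, 1, 2 },  /* second CPU: table says 1, CPUID says 2 */
};
```

Calling count_apicid_mismatches(thread_values, 2) flags only the second CPU. The sketch can detect the disagreement, but it cannot say which source is the wrong one, which is the open question in the reply above.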