Re: [PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()

From: Yasuaki Ishimatsu
Date: Mon Jan 30 2017 - 11:36:59 EST

Hi Thomas,

Do you have any idea how to fix the issue?
If so, please send me the patch.

Yasuaki Ishimatsu

On 01/24/2017 02:54 PM, Thomas Gleixner wrote:
On Tue, 24 Jan 2017, Yasuaki Ishimatsu wrote:
rapl_cpu_prepare() must be called after the logical package id of the CPU
has been set by topology_update_package_map().

But when onlining a hot-added CPU, rapl_cpu_prepare() is called before
the logical package id of that CPU has been set. So cpu_to_rapl_pmu()
in rapl_cpu_prepare() looks up the rapl_pmu of the wrong logical package id
and rapl_cpu_prepare() initializes the wrong rapl_pmu.

After that, the logical package id of the hot-added CPU is set by
topology_update_package_map(). But rapl_cpu_prepare() has not
initialized the pmu for that logical package id.
So when rapl_cpu_online() is called, cpu_to_rapl_pmu() returns NULL and
the following NULL pointer dereference occurs.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: rapl_cpu_online+0x8d/0xb0
Call Trace:
? rapl_cpu_offline+0xc0/0xc0
? sort_range+0x30/0x30
? kthread_park+0x90/0x90
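
The ordering problem described above can be reproduced in a small userspace
model. This is only a sketch, not the kernel code: every toy_* name is
hypothetical, and it assumes that a CPU whose logical package id has not
been set yet aliases package 0. The prepare step is keyed on whatever
package id the CPU has at that moment, so running it before the map update
leaves the real package without a pmu.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NR_CPUS     4
#define NR_PACKAGES 2

/* Hypothetical stand-in for struct rapl_pmu. */
struct toy_pmu {
    int initialized;
};

/* Logical package id per CPU; 0 until set (assumption of this model). */
static int cpu_to_pkg[NR_CPUS];
/* Per-package pmu slot, NULL until the prepare step fills it. */
static struct toy_pmu *pkg_pmu[NR_PACKAGES];
static struct toy_pmu pmu_storage[NR_PACKAGES];

static void toy_reset(void)
{
    memset(cpu_to_pkg, 0, sizeof(cpu_to_pkg));
    memset(pkg_pmu, 0, sizeof(pkg_pmu));
    memset(pmu_storage, 0, sizeof(pmu_storage));
}

/* Models topology_update_package_map(): records the real package id. */
static void toy_update_package_map(int cpu, int pkg)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    assert(pkg >= 0 && pkg < NR_PACKAGES);
    cpu_to_pkg[cpu] = pkg;
}

/* Models cpu_to_rapl_pmu(): lookup keyed on the current package id. */
static struct toy_pmu *toy_cpu_to_pmu(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return pkg_pmu[cpu_to_pkg[cpu]];
}

/* Models the prepare step: sets up the pmu of the CPU's current package. */
static void toy_cpu_prepare(int cpu)
{
    int pkg = cpu_to_pkg[cpu];

    if (!pkg_pmu[pkg]) {
        pkg_pmu[pkg] = &pmu_storage[pkg];
        pkg_pmu[pkg]->initialized = 1;
    }
}

/* Models the online step: returns -1 where the report dereferences NULL. */
static int toy_cpu_online(int cpu)
{
    struct toy_pmu *pmu = toy_cpu_to_pmu(cpu);

    return pmu ? 0 : -1;
}

/*
 * Hot-add one CPU. map_first selects whether the package map is updated
 * before (1) or after (0) the prepare step.
 */
int toy_hotadd(int cpu, int real_pkg, int map_first)
{
    toy_reset();
    if (map_first)
        toy_update_package_map(cpu, real_pkg);
    toy_cpu_prepare(cpu);
    if (!map_first)
        toy_update_package_map(cpu, real_pkg);
    return toy_cpu_online(cpu);
}
```

In this model toy_hotadd(2, 1, 0) returns -1, because the prepare step
filled the slot for package 0 while the CPU really belongs to package 1.
toy_hotadd(2, 1, 1) returns 0. The model only shows that the order matters;
it says nothing about which hotplug state the setup is allowed to run in.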

The patch renames rapl_cpu_prepare() to rapl_cpu_starting() and moves
its cpuhp_state so that rapl_cpu_starting() is called
after topology_update_package_map().

Does not work. You cannot call that callback in the starting context. It
does allocations. This needs to be fixed in a different way. I'll have a look