Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event

From: Borislav Petkov
Date: Wed Sep 21 2016 - 10:01:49 EST


On Wed, Sep 21, 2016 at 09:32:47AM -0400, Prarit Bhargava wrote:
> This is not the right thing to do [1]. The topology directory should exist as
> long as the thread is present in the system. The thread (and its core) are
> still physically there, it's just that the thread is not available to the
> scheduler. The topology of the thread hasn't changed due to it being soft
> offlined this way.

So far so good.

> turbostat was modified to deal with the missing topology directory, and in tree
> utility cpupower prints out significantly less information when a thread is
> offline.

Why does it do that? Why does an offlined core change that info?

Concrete details please.

> ISTR a powertop bug due to hotplug too. This makes these monitoring
> utilities a problem for users who want only one thread per core.

One thread per core? What does that mean?

> This now means that
>
> echo 0 > /sys/devices/system/cpu/cpu29/online
>
> will result in the thread's topology directory staying around until the struct
> device associated with it is destroyed upon a physical socket hotplug event.

So your 2/2 says that on an offlined CPU, you have

/sys/devices/system/cpu/cpu10/topology/core_id:3
/sys/devices/system/cpu/cpu10/topology/core_siblings:0000
/sys/devices/system/cpu/cpu10/topology/core_siblings_list:
/sys/devices/system/cpu/cpu10/topology/physical_package_id:0
/sys/devices/system/cpu/cpu10/topology/thread_siblings:0000
/sys/devices/system/cpu/cpu10/topology/thread_siblings_list:

and this information is bollocks. core_siblings is 0, thread_siblings
is 0. You can just as well not have them there at all.

So is this whole jumping around just so that you can have a
/sys/devices/system/cpu/cpu10/topology directory and so that tools don't
get confused by it missing?
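
For illustration only, here is a minimal sketch of how a userspace tool could tell the two cases apart: the topology directory being absent versus being present with empty sibling masks. This is not how turbostat or cpupower actually work. The helper names read_topology() and has_siblings() and the sysfs_root parameter are hypothetical, made up for this example.

```python
import os


def read_topology(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return {attribute: value} for cpu<N>/topology, or None if the
    directory does not exist (hypothetical helper, not from any tool)."""
    topo_dir = os.path.join(sysfs_root, f"cpu{cpu}", "topology")
    if not os.path.isdir(topo_dir):
        return None  # directory missing, e.g. removed on soft offline
    info = {}
    for name in sorted(os.listdir(topo_dir)):
        try:
            with open(os.path.join(topo_dir, name)) as f:
                info[name] = f.read().strip()
        except OSError:
            continue  # attribute vanished or is unreadable; skip it
    return info


def has_siblings(info, attr="thread_siblings"):
    """True if the hex cpumask in `attr` has at least one bit set.
    sysfs prints masks as comma-separated hex groups, e.g. 0000 or
    00000000,00000fff."""
    mask = info.get(attr, "").replace(",", "")
    return int(mask or "0", 16) != 0
```

With the values quoted above, read_topology(10) would return a dict, but has_siblings() would be False for both thread_siblings and core_siblings, so a caller learns little more than it would from a missing directory.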

So again, what exactly are those tools accessing and how do the
offlined cores puzzle them?

A concrete example please:

"turbostat tries to access X and it is gone when the CPU is offlined so
this is a problem because it can't do Y"

Thanks.

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)