Re: [PATCH 1/1] perf: Add CPU hotplug support for events
From: Peter Zijlstra
Date: Mon Feb 19 2018 - 04:11:29 EST
On Fri, Feb 16, 2018 at 05:48:21PM -0800, Raghavendra Rao Ananta wrote:
> I am sure we can fix it, but apart
> from the "why we are doing hotplug?" question, was there specifically
> any issue with our patch?
Yes, the extra list is crazy. We don't keep events in extra lists when a
task isn't currently running either. A CPU being offline shouldn't be
(much) different from that.
Thinking more, we should also get rid of that HOTPLUG offset thing.
> > > > Also, you _still_ don't explain why you care about dead CPUs.
> > > >
> I wanted to understand, if we no longer care about hotplugging of CPUs, then
> why do we still have exported symbols such as cpu_up() and cpu_down()?
Not sure why we have them, legacy probably. The rcu/lock torture module
is the only legitimate user of them.
But if having those exports gives people the impression it's a sane thing
to do hotplug from modules, we should just take them out.
> Moreover, we also have the hotplug interface exposed to user-space as well
> (through sysfs). As long as these interfaces exist, there's always a
> potential chance of bringing the CPU up/down. Can you please clear this
> thing up for me?
Hotplug is an absolute utter slow path. We do our absolute best to put
the entire burden on the hotplug path such that we don't perturb normal
things.
Its primary existence is for physical CPU hotplug, not resource
management. Although there seems to be a misguided 'I have this hammer,
everything is a nail' thing going on.
I suppose these 'once' things like changing the topology of the machine
-- e.g. 'unplug' all but one of the SMT threads, are OK as well.
And RAS things that take a CPU down when there's 'trouble' are also fine.
But anything that does hotplug semi regularly is batshit insane. One of
the first things hotplug does is synchronize_rcu(), that can take a
_long_ time.