Re: [PATCH]cpuset: add new API to change cpuset top group's cpus

From: Peter Zijlstra
Date: Tue May 19 2009 - 18:37:50 EST


On Tue, 2009-05-19 at 15:01 -0400, Len Brown wrote:
> > ... the point is, we
> > don't need a new interface to force a cpu idle. Hotplug does that.
> >
> > Furthermore, we should not want anything outside of that, either the cpu
> > is there available for work, or it's not -- halfway measures don't make
> > sense.
> >
> > Furthermore, we already have power aware scheduling which tries to
> > aggregate idle time on cpu/core/packages so as to maximize the idle time
> > power savings. Use it there.
>
> Some context...

<snip default story of thermal overcommit>

> > > > Besides, a hot-removed cpu sits in a dead-loop halt, which isn't power
> > > > efficient. Making a hot-removed cpu enter a deep C-state has been on the
> > > > wish list for a long time, but is still not available. acpi_processor_idle
> > > > is a module, and the cpuidle governor potentially can't handle an offline cpu.
> > >
> > > Then fix that hot-unplug idle loop. I agree that the hlt thing is silly,
> > > and I've no idea why it's still there; it seems like a much better
> > > candidate for your efforts than this.
>
> CONFIG_HOTPLUG_CPU has been problematic in the past.
> It does more than what we need here, so we thought
> a lighter-weight and lower-latency method that simply
> didn't schedule to the idled cpu would suffice.

> We are fixing the hotplug-unplug idle loop, but there
> turn out to be some issues with it related to idle
> processors with interrupts disabled that don't actually
> get down into the deep C-states we request :-(
>
> So this is why you see a patch for a "halfway measure",
> it does what is necessary, and does nothing more.

It's broken, it's ill-defined and it's not going to happen.

Ripping cpus out of the top cpuset might upset the cpuset configuration
and has no regard for any realtime processes. And I must take back my
earlier suggestion: hotplug is a bad solution too.

There's just too much user policy (cpuset configuration) to upset.

The IBM folks are working on a scheduler-based solution, please talk to
them.

