Re: [patch 0/3] KVM CPU frequency change hypercalls

From: Marcelo Tosatti
Date: Tue Feb 28 2017 - 16:16:11 EST


On Fri, Feb 24, 2017 at 04:34:52PM +0100, Paolo Bonzini wrote:
>
>
> On 24/02/2017 14:04, Marcelo Tosatti wrote:
> >>>>> What's the current use case, or foreseeable future use case, for save/restore
> >>>>> across preemption again? (which would validate the broken-by-design
> >>>>> claim).
> >>>> Stop a guest that is using cpufreq, start a guest that is not using it.
> >>>> The second guest's performance now depends on the state that the first
> >>>> guest left in cpufreq.
> >>> Nothing forbids the host to implement switching with the
> >>> current hypercall interface: all you need is a scheduler
> >>> hook.
> >> Can it be done in vcpu_load/vcpu_put? But you still would have two
> >> components (KVM and sysfs) potentially fighting over the frequency, and
> >> that's still a bit ugly.
> >
> > Change the frequency at vcpu_load/vcpu_put? Yes: call into
> > cpufreq-userspace. But there is no notion of "per-task frequency" in the
> > Linux kernel (which was the starting point of this subthread).
>
> There isn't, but this patchset is providing a direct path from a task to
> cpufreq-userspace. This is as close as you can get to a per-task frequency.

cpufreq-userspace is meant to be used by tasks in userspace.
That's why it is called "userspace".
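
For reference, a minimal sketch of how a userspace task drives this
governor through sysfs. The scaling_governor and scaling_setspeed files
are the standard cpufreq attributes; the helper names and the "root"
parameter are illustrative assumptions, not part of the patchset.

```python
import os

# Standard sysfs location of the per-CPU cpufreq attributes.
CPUFREQ_ROOT = "/sys/devices/system/cpu"


def cpufreq_path(cpu, attr, root=CPUFREQ_ROOT):
    """Build the path of a cpufreq attribute for one CPU (hypothetical helper)."""
    return os.path.join(root, f"cpu{cpu}", "cpufreq", attr)


def set_userspace_freq(cpu, khz, root=CPUFREQ_ROOT):
    """Select the userspace governor on `cpu`, then request `khz` (in kHz).

    Needs root privileges on a real system. `root` is an assumption added
    so the sketch can be exercised against a scratch directory.
    """
    with open(cpufreq_path(cpu, "scaling_governor", root), "w") as f:
        f.write("userspace")
    with open(cpufreq_path(cpu, "scaling_setspeed", root), "w") as f:
        f.write(str(khz))
```

Whatever calls set_userspace_freq() is the single agent deciding the
frequency for that CPU, which is the point being made here.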

> > But if you configure all CPUs in the system as cpufreq-userspace,
> > then some other (userspace program) has to decide the frequency
> > for the other CPUs.
> >
> > Which agent would do that and why? That's why I initially asked "what's the
> > use case".
>
> You could just pin them at the highest non-TurboBoost frequency until a
> guest runs. That's assuming that they are idle and, because of
> isol_cpus/nohz_full, they would be almost always in deep C state anyway.
>
> Paolo

The original claim of the thread was: "this feature (frequency
hypercalls) works for the pinned vcpu<->pcpu case, with the pcpu dedicated
exclusively to the vcpu; let's try to extend this to other cases".

Which is a valid and useful direction to go.

However, there is currently no user for multiple vcpus on the same pcpu.

If there were multiple vcpus, all of them requesting a given
frequency, it would be necessary to:

1) Keep the pcpu at the highest of the requested
frequencies.

OR

2) Switch frequencies on every vcpu switch. But since switching
frequencies can take up to 70us (*) (depending on the processor),
it's generally not worthwhile to switch frequencies between
task switches.

So it's a dead end...
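
A rough sketch of the two options, to put numbers on them. The function
names and the timeslice values are assumptions for illustration; only
the 70us figure comes from the reference below. Treating the whole
transition latency as lost time is a worst-case simplification.

```python
def pcpu_frequency(requests_khz):
    """Option 1: run the pcpu at the highest frequency any vcpu requested."""
    return max(requests_khz)


def switch_overhead(switch_latency_us, timeslice_us):
    """Option 2: fraction of each timeslice spent changing frequency,
    assuming one frequency change per vcpu switch."""
    return switch_latency_us / timeslice_us


# Three vcpus sharing one pcpu, with hypothetical requests in kHz.
print(pcpu_frequency([1200000, 2400000, 1800000]))

# 70us transition against hypothetical 1ms and 100us timeslices.
print(switch_overhead(70, 1000))
print(switch_overhead(70, 100))
```

With option 1 every vcpu but the most demanding one runs faster than it
asked for, so the per-vcpu request loses its meaning. With option 2 the
cost grows as timeslices shrink: at 70us per change, a 100us timeslice
would spend most of its time in the transition.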

*: http://www.ena-hpc.org/2013/pdf/04.pdf