Re: [PATCH RFC hack dont apply] intel_idle: support running within a VM
From: Thomas Gleixner
Date: Tue Oct 03 2017 - 17:03:16 EST

On Mon, 2 Oct 2017, Jacob Pan wrote:
> On Sat, 30 Sep 2017 01:21:43 +0200
> "Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote:
>
> > On Sat, Sep 30, 2017 at 12:01 AM, Michael S. Tsirkin <mst@xxxxxxxxxx>
> > wrote:
> > > intel idle driver does not DTRT when running within a VM:
> > > when going into a deep power state, the right thing to
> > > do is to exit to hypervisor rather than to keep polling
> > > within guest using mwait.
> > >
> > > Currently the solution is just to exit to hypervisor each time we go
> > > idle - this is why kvm does not expose the mwait leaf to guests even
> > > when it allows guests to do mwait.
> > >
> > > But that's not ideal - it seems better to use the idle driver to
> > > guess when will the next interrupt arrive.
> >
> > The idle driver alone is not sufficient for that, though.
> >
> I second that. Why try to solve this problem at vendor specific driver
> level? perhaps just a pv idle driver that decide whether to vmexit
> based on something like local per vCPU timer expiration? I guess we
> can't predict other wake events such as interrupts.
> e.g.
> if (get_next_timer_interrupt() > kvm_halt_target_residency)

Bah. No. get_next_timer_interrupt() is not available for abuse in random
cpuidle driver code. It has state and it's tied to the nohz code.

There is the series from Aubrey which makes use of the various idle
prediction mechanisms (scheduler, irq timings, idle governor) to get an idea
of the estimated idle time. Exactly this information can be fed to the
kvmidle driver, which can act accordingly.

Hacking a random hardware-specific idle driver is definitely the wrong
approach. It might be useful to chain the kvmidle driver and the
hardware-specific drivers at some point, i.e. if the kvmidle driver decides
not to exit, it delegates the mwait decision to the proper hardware driver in
order not to reimplement all the required logic again. But that's a different
story.
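
To make the chaining idea a bit more concrete, here is a rough userspace
sketch in C. Everything in it is hypothetical: kvmidle_select(), struct
hw_idle_driver, the exit threshold parameter and the toy 20us hardware state
split are made-up names and values for illustration, not existing kernel
interfaces. It assumes the predicted idle time comes from whatever prediction
machinery is available and only shows the decision: exit to the hypervisor
when the predicted idle time is long, otherwise stay in the guest and let the
hardware driver pick the state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; none of this is an existing kernel API. */

enum idle_action {
	IDLE_EXIT_TO_HOST,	/* exit to the hypervisor so it can run something else */
	IDLE_STAY_IN_GUEST,	/* stay in the guest; the hardware driver picks the state */
};

/* Minimal stand-in for a hardware-specific idle driver. */
struct hw_idle_driver {
	/* Returns a hardware idle state index for the predicted idle time. */
	int (*select_state)(uint64_t predicted_idle_ns);
};

struct kvmidle_result {
	enum idle_action action;
	int hw_state;		/* valid only for IDLE_STAY_IN_GUEST, otherwise -1 */
};

/*
 * Decide whether to exit to the hypervisor. If not, delegate the state
 * selection to the chained hardware driver instead of reimplementing it.
 */
static struct kvmidle_result
kvmidle_select(const struct hw_idle_driver *hw, uint64_t predicted_idle_ns,
	       uint64_t exit_threshold_ns)
{
	struct kvmidle_result res = { IDLE_EXIT_TO_HOST, -1 };

	assert(hw != NULL && hw->select_state != NULL);

	/* Long predicted idle: hand the CPU back to the host. */
	if (predicted_idle_ns >= exit_threshold_ns)
		return res;

	/* Short predicted idle: stay in the guest, hardware driver decides. */
	res.action = IDLE_STAY_IN_GUEST;
	res.hw_state = hw->select_state(predicted_idle_ns);
	return res;
}

/* Toy hardware driver: shallow state below 20us, deeper state otherwise. */
static int toy_hw_select_state(uint64_t predicted_idle_ns)
{
	return predicted_idle_ns < 20000 ? 0 : 1;
}
```

The point of the split is that the exit policy lives in one place and the
hardware driver is only consulted when the guest stays put.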

See http://lkml.kernel.org/r/1506756034-6340-1-git-send-email-aubrey.li@xxxxxxxxx

Thanks,
tglx