Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

From: Quan Xu
Date: Tue Nov 14 2017 - 06:43:24 EST

Next message: Stefan Schake: "Re: [PATCH 2/2] drm/vc4: Ensure interrupts are disabled"
Previous message: Arnd Bergmann: "[PATCH] debugfs: fix debugfs_real_fops() build error"
In reply to: Juergen Gross: "Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops"
Next in thread: Juergen Gross: "Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 2017/11/14 18:27, Juergen Gross wrote:

On 14/11/17 10:38, Quan Xu wrote:

On 2017/11/14 15:30, Juergen Gross wrote:

On 14/11/17 08:02, Quan Xu wrote:

On 2017/11/13 18:53, Juergen Gross wrote:

On 13/11/17 11:06, Quan Xu wrote:

From: Quan Xu <quan.xu0@xxxxxxxxx>

So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called
in idle path which will poll for a while before we enter the real idle
state.

In virtualization, idle path includes several heavy operations
includes timer access(LAPIC timer or TSC deadline timer) which will
hurt performance especially for latency intensive workload like
message
passing task. The cost is mainly from the vmexit which is a hardware
context switch between virtual machine and hypervisor. Our solution is
to poll for a while and do not enter real idle path if we can get the
schedule event during polling.

Poll may cause the CPU waste so we adopt a smart polling mechanism to
reduce the useless poll.

Signed-off-by: Yang Zhang <yang.zhang.wz@xxxxxxxxx>
Signed-off-by: Quan Xu <quan.xu0@xxxxxxxxx>
Cc: Juergen Gross <jgross@xxxxxxxx>
Cc: Alok Kataria <akataria@xxxxxxxxxx>
Cc: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: x86@xxxxxxxxxx
Cc: virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx

Hmm, is the idle entry path really so critical to performance that a
new
pvops function is necessary?

Juergen, Here is the data we get when running benchmark netperf:
ÂÂ1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
ÂÂÂÂ 29031.6 bit/s -- 76.1 %CPU

ÂÂ2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
ÂÂÂÂ 35787.7 bit/s -- 129.4 %CPU

ÂÂ3. w/ kvm dynamic poll:
ÂÂÂÂ 35735.6 bit/s -- 200.0 %CPU

ÂÂ4. w/patch and w/ kvm dynamic poll:
ÂÂÂÂ 42225.3 bit/s -- 198.7 %CPU

ÂÂ5. idle=poll
ÂÂÂÂ 37081.7 bit/s -- 998.1 %CPU

ÂÂw/ this patch, we will improve performance by 23%.. even we could
improve
ÂÂperformance by 45.4%, if we use w/patch and w/ kvm dynamic poll.
also the
ÂÂcost of CPU is much lower than 'idle=poll' case..

I don't question the general idea. I just think pvops isn't the best way
to implement it.

Wouldn't a function pointer, maybe guarded
by a static key, be enough? A further advantage would be that this
would
work on other architectures, too.

I assume this feature will be ported to other archs.. a new pvops makes

ÂÂÂÂÂ sorry, a typo.. /other archs/other hypervisors/
ÂÂÂÂÂ it refers hypervisor like Xen, HyperV and VMware)..

code
clean and easy to maintain. also I tried to add it into existed pvops,
but it
doesn't match.

You are aware that pvops is x86 only?

yes, I'm aware..

I really don't see the big difference in maintainability compared to the
static key / function pointer variant:

void (*guest_idle_poll_func)(void);
struct static_key guest_idle_poll_key __read_mostly;

static inline void guest_idle_poll(void)
{
ÂÂÂÂif (static_key_false(&guest_idle_poll_key))
ÂÂÂÂÂÂÂ guest_idle_poll_func();
}

thank you for your sample code :)
I agree there is no big difference.. I think we are discussion for two
things:
Â1) x86 VM on different hypervisors
Â2) different archs VM on kvm hypervisor

What I want to do is x86 VM on different hypervisors, such as kvm / xen
/ hyperv ..

Why limit the solution to x86 if the more general solution isn't
harder?

As you didn't give any reason why the pvops approach is better other
than you don't care for non-x86 platforms you won't get an "Ack" from
me for this patch.

It just looks a little odder to me. I understand you care about no-x86 arch.

Are you aware 'pv_time_ops' for arm64/arm/x86 archs, defined in
ÂÂ - arch/arm64/include/asm/paravirt.h
ÂÂ - arch/x86/include/asm/paravirt_types.h
ÂÂ - arch/arm/include/asm/paravirt.h

I am unfamilar with arm code. IIUC, if you'd implement pv_idle_ops
for arm/arm64 arch, you'd define a same structure in
ÂÂ - arch/arm64/include/asm/paravirt.hÂÂÂÂ or
ÂÂ - arch/arm/include/asm/paravirt.h

.. instead of static key / fuction.

then implement a real function in
ÂÂ - arch/arm/kernel/paravirt.c.

Also I wonder HOW/WHERE to define a static key/function, then to benifit
x86/no-x86 archs?

Quan
Alibaba Cloud

And KVM would just need to set guest_idle_poll_func and enable the
static key. Works on non-x86 architectures, too.

.. referred to 'pv_mmu_ops', HyperV and Xen can implement their own
functions for 'pv_mmu_ops'.
I think it is the same to pv_idle_ops.

with above explaination, do you still think I need to define the static
key/function pointer variant?

btw, any interest to port it to Xen HVM guest? :)

Maybe. But this should work for Xen on ARM, too.

Juergen

Next message: Stefan Schake: "Re: [PATCH 2/2] drm/vc4: Ensure interrupts are disabled"
Previous message: Arnd Bergmann: "[PATCH] debugfs: fix debugfs_real_fops() build error"
In reply to: Juergen Gross: "Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops"
Next in thread: Juergen Gross: "Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]