Re: [PATCH 0/9] Enable haltpoll for arm64

From: Ankur Arora
Date: Tue Apr 30 2024 - 14:57:51 EST



> Subject: Re: [PATCH 0/9] Enable haltpoll for arm64

A correction: please read the subject for the series as [PATCH v5] ...

Missed the version number it while sending out.

Thanks
Ankur

Ankur Arora <ankur.a.arora@xxxxxxxxxx> writes:

> This patchset enables the cpuidle-haltpoll driver and its namesake
> governor on arm64. This is specifically interesting for KVM guests by
> reducing the IPC latencies.
>
> Comparing idle switching latencies on an arm64 KVM guest with
> perf bench sched pipe:
>
> usecs/op %stdev
>
> no haltpoll (baseline) 13.48 +- 5.19%
> with haltpoll 6.84 +- 22.07%
>
>
> No change in performance for a similar test on x86:
>
> usecs/op %stdev
>
> haltpoll w/ cpu_relax() (baseline) 4.75 +- 1.76%
> haltpoll w/ smp_cond_load_relaxed() 4.78 +- 2.31%
>
> Both sets of tests were on otherwise idle systems with guest VCPUs
> pinned to specific PCPUs. One reason for the higher stdev on arm64
> is that trapping of the WFE instruction by the host KVM is contingent
> on the number of tasks on the runqueue.
>
>
> The patch series is organized in four parts:
> - patches 1, 2 mangle the config option ARCH_HAS_CPU_RELAX, renaming
> and moving it from x86 to common architectural code.
> - next, patches 3-5, reorganize the haltpoll selection and init logic
> to allow architecture code to select it.
> - patch 6, reorganizes the poll_idle() loop, switching from using
> cpu_relax() directly to smp_cond_load_relaxed().
> - and finally, patches 7-9, add the bits for arm64 support.
>
> What is still missing: this series largely completes the haltpoll side
> of functionality for arm64. There are, however, a few related areas
> that still need to be threshed out:
>
> - WFET support: WFE on arm64 does not guarantee that poll_idle()
> would terminate in halt_poll_ns. Using WFET would address this.
> - KVM_NO_POLL support on arm64
> - KVM TWED support on arm64: allow the host to limit time spent in
> WFE.
>
>
> Changelog:
>
> v5:
> - rework the poll_idle() loop around smp_cond_load_relaxed() (review
> comment from Tomohiro Misono.)
> - also rework selection of cpuidle-haltpoll. Now selected based
> on the architectural selection of ARCH_CPUIDLE_HALTPOLL.
> - arch_haltpoll_supported() (renamed from arch_haltpoll_want()) on
> arm64 now depends on the event-stream being enabled.
> - limit POLL_IDLE_RELAX_COUNT on arm64 (review comment from Haris Okanovic)
> - ARCH_HAS_CPU_RELAX is now renamed to ARCH_HAS_OPTIMIZED_POLL.
>
> v4 changes from v3:
> - change 7/8 per Rafael input: drop the parens and use ret for the final check
> - add 8/8 which renames the guard for building poll_state
>
> v3 changes from v2:
> - fix 1/7 per Petr Mladek - remove ARCH_HAS_CPU_RELAX from arch/x86/Kconfig
> - add Ack-by from Rafael Wysocki on 2/7
>
> v2 changes from v1:
> - added patch 7 where we change cpu_relax with smp_cond_load_relaxed per PeterZ
> (this improves by 50% at least the CPU cycles consumed in the tests above:
> 10,716,881,137 now vs 14,503,014,257 before)
> - removed the ifdef from patch 1 per RafaelW
>
> Ankur Arora (4):
> cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
> cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL
> arm64: support cpuidle-haltpoll
> cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64
>
> Joao Martins (4):
> Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig
> cpuidle-haltpoll: define arch_haltpoll_supported()
> governors/haltpoll: drop kvm_para_available() check
> arm64: define TIF_POLLING_NRFLAG
>
> Mihai Carabas (1):
> cpuidle/poll_state: poll via smp_cond_load_relaxed()
>
> arch/Kconfig | 3 +++
> arch/arm64/Kconfig | 10 ++++++++++
> arch/arm64/include/asm/cpuidle_haltpoll.h | 21 +++++++++++++++++++++
> arch/arm64/include/asm/thread_info.h | 2 ++
> arch/x86/Kconfig | 4 +---
> arch/x86/include/asm/cpuidle_haltpoll.h | 1 +
> arch/x86/kernel/kvm.c | 10 ++++++++++
> drivers/acpi/processor_idle.c | 4 ++--
> drivers/cpuidle/Kconfig | 5 ++---
> drivers/cpuidle/Makefile | 2 +-
> drivers/cpuidle/cpuidle-haltpoll.c | 9 ++-------
> drivers/cpuidle/governors/haltpoll.c | 6 +-----
> drivers/cpuidle/poll_state.c | 21 ++++++++++++++++-----
> drivers/idle/Kconfig | 1 +
> include/linux/cpuidle.h | 2 +-
> include/linux/cpuidle_haltpoll.h | 5 +++++
> 16 files changed, 79 insertions(+), 27 deletions(-)
> create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h


--
ankur