Re: [PATCH] sched/core: Drop spinlocks on contention iff kernel is preemptible

From: Friedrich Weber
Date: Thu Feb 01 2024 - 10:24:43 EST


On 10/01/2024 22:47, Sean Christopherson wrote:
> Use preempt_model_preemptible() to detect a preemptible kernel when
> deciding whether or not to reschedule in order to drop a contended
> spinlock or rwlock. Because PREEMPT_DYNAMIC selects PREEMPTION, kernels
> built with PREEMPT_DYNAMIC=y will yield contended locks even if the live
> preemption model is "none" or "voluntary". In short, make kernels with
> dynamically selected models behave the same as kernels with statically
> selected models.
>
> Somewhat counter-intuitively, NOT yielding a lock can provide better
> latency for the relevant tasks/processes. E.g. KVM x86's mmu_lock, a
> rwlock, is often contended between an invalidation event (takes mmu_lock
> for write) and a vCPU servicing a guest page fault (takes mmu_lock for
> read). For _some_ setups, letting the invalidation task complete even
> if there is mmu_lock contention provides lower latency for *all* tasks,
> i.e. the invalidation completes sooner *and* the vCPU services the guest
> page fault sooner.
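
To make the behavioral change described above concrete, here is a minimal userspace sketch of the two decision rules. It is not the kernel code: the enum, `model_is_preemptible()`, and both `needbreak_*()` helpers are hypothetical names made up for illustration, and the model assumes only three preemption modes, with "full" being the only preemptible one.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the live preemption model (illustration only). */
enum preempt_model {
    PREEMPT_MODEL_NONE,
    PREEMPT_MODEL_VOLUNTARY,
    PREEMPT_MODEL_FULL
};

/*
 * Build-time rule: yield a contended lock whenever the kernel was built
 * with preemption support. PREEMPT_DYNAMIC=y implies this is true, so the
 * boot-time choice of "none" or "voluntary" is ignored.
 */
static bool needbreak_build_time(bool config_preemption, bool lock_contended)
{
    return config_preemption && lock_contended;
}

/* Simplified model of a runtime "is the live model preemptible?" check. */
static bool model_is_preemptible(enum preempt_model live)
{
    return live == PREEMPT_MODEL_FULL;
}

/*
 * Runtime rule: yield a contended lock only if the live model is
 * preemptible, so dynamic "none"/"voluntary" match their static builds.
 */
static bool needbreak_runtime(enum preempt_model live, bool lock_contended)
{
    return model_is_preemptible(live) && lock_contended;
}
```

Under these assumptions, a PREEMPT_DYNAMIC kernel booted with preempt=voluntary yields a contended lock under the build-time rule but keeps holding it under the runtime rule, which is the case the testing below exercises.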

I've been testing this patch for some time now:

Applied on top of Linux 6.7 (0dd3ee31) on a PREEMPT_DYNAMIC kernel with
preempt=voluntary, it fixes an issue for me where KVM guests would
temporarily freeze if NUMA balancing and KSM are active on a NUMA host.
See [1] for more details.

In addition, I've been running with this patch on my (non-NUMA)
workstation with (admittedly fairly light) VM workloads for two weeks
now, and so far I haven't noticed any negative effects (this is on top
of a modified 6.5.11 kernel though).

Side note: I noticed the patch no longer applies on 6.8-rc2; it seems
sched.h was refactored in the meantime.

[1]
https://lore.kernel.org/kvm/ef81ff36-64bb-4cfe-ae9b-e3acf47bff24@xxxxxxxxxxx/
