Re: [PATCH 0/1] sched: Restore PREEMPT_NONE as default
From: Peter Zijlstra
Date: Fri Apr 03 2026 - 17:32:26 EST
On Fri, Apr 03, 2026 at 07:19:36PM +0000, Salvatore Dipietro wrote:
> We are reporting a throughput and latency regression on PostgreSQL
> pgbench (simple-update) on arm64 caused by commit 7dadeaa6e851
> ("sched: Further restrict the preemption modes") introduced in
> v7.0-rc1.
>
> The regression manifests as throughput dropping to 0.51x of the
> pre-commit baseline on a pgbench simple-update workload with 1024
> clients on a 96-vCPU (AWS EC2 m8g.24xlarge) Graviton4 arm64 system.
> Perf profiling shows 55% of CPU time is consumed spinning in
> PostgreSQL's userspace spinlock (s_lock()) under PREEMPT_LAZY:
>
> |- 56.03% - StartReadBuffer
>    |- 55.93% - GetVictimBuffer
>       |- 55.93% - StrategyGetBuffer
>          |- 55.60% - s_lock                       <<<< 55% of time
>          |  |- 0.39% - el0t_64_irq
>          |  |- 0.10% - perform_spin_delay
>          |- 0.08% - LockBufHdr
>          |- 0.07% - hash_search_with_hash_value
> |- 0.40% - WaitReadBuffers
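[For reference, the reported workload corresponds roughly to a pgbench
run like the one below. The scale factor, run duration, thread count,
and database name are assumptions, not taken from the report; only the
simple-update script and the 1024-client count come from it.]

```shell
# Hypothetical reproduction sketch of the reported workload.
# Scale factor (-s), duration (-T), threads (-j) and db name assumed.
pgbench -i -s 1000 bench
pgbench -b simple-update -c 1024 -j 96 -T 300 bench
```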
The fix here is for PostgreSQL to make use of the rseq slice extension:
https://lkml.kernel.org/r/20251215155615.870031952@xxxxxxxxxxxxx
That should limit the exposure to lock holder preemption (unless
PostgreSQL is doing seriously egregious things).