Re: [PATCH 0/1] sched: Restore PREEMPT_NONE as default

From: Qais Yousef

Date: Sun Apr 05 2026 - 20:43:54 EST


On 04/03/26 23:32, Peter Zijlstra wrote:
> On Fri, Apr 03, 2026 at 07:19:36PM +0000, Salvatore Dipietro wrote:
> > We are reporting a throughput and latency regression on PostgreSQL
> > pgbench (simple-update) on arm64 caused by commit 7dadeaa6e851
> > ("sched: Further restrict the preemption modes") introduced in
> > v7.0-rc1.
> >
> > The regression manifests as a 0.51x throughput drop on a pgbench
> > simple-update workload with 1024 clients on a 96-vCPU
> > (AWS EC2 m8g.24xlarge) Graviton4 arm64 system. Perf profiling
> > shows 55% of CPU time is consumed spinning in PostgreSQL's
> > userspace spinlock (s_lock()) under PREEMPT_LAZY:
> >
> > |- 56.03% - StartReadBuffer
> >    |- 55.93% - GetVictimBuffer
> >       |- 55.93% - StrategyGetBuffer
> >          |- 55.60% - s_lock                        <<<< 55% of time
> >          |  |- 0.39% - el0t_64_irq
> >          |  |- 0.10% - perform_spin_delay
> >          |- 0.08% - LockBufHdr
> >          |- 0.07% - hash_search_with_hash_value
> > |- 0.40% - WaitReadBuffers
>
> The fix here is to make PostgreSQL make use of rseq slice extension:

Or perhaps use a longer base_slice_ns via debugfs? I think we end up just short
of 4ms on most systems now; bumping it to 5 or 6ms might help re-hide the
lock holder preemption.
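FWIW, that knob can be poked like this (a sketch; assumes debugfs is mounted
at /sys/kernel/debug and an EEVDF kernel that exposes base_slice_ns; needs
root, and the 6ms value is just illustrative):

```shell
# Read the current base slice (nanoseconds)
cat /sys/kernel/debug/sched/base_slice_ns

# Try a longer slice, e.g. 6ms, to reduce mid-critical-section preemption
echo 6000000 > /sys/kernel/debug/sched/base_slice_ns
```

Worth re-running pgbench with a couple of values before touching anything
else, since this affects every task on the system, not just PostgreSQL.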

>
> https://lkml.kernel.org/r/20251215155615.870031952@xxxxxxxxxxxxx
>
> That should limit the exposure to lock holder preemption (unless
> PostgreSQL is doing seriously egregious things).