Re: [PATCH 0/1] sched: Restore PREEMPT_NONE as default
From: Mitsumasa KONDO
Date: Sun Apr 05 2026 - 10:45:04 EST
I believe the root cause is the inadequacy of PostgreSQL's arm64
SPIN_DELAY() implementation, which PREEMPT_LAZY merely exposed.
PostgreSQL's SPIN_DELAY() uses dramatically different instructions
per architecture (src/include/storage/s_lock.h):
  x86_64:  rep; nop  (PAUSE, ~140 cycles)
  arm64:   isb       (pipeline flush, ~10-20 cycles)
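For reference, the two delay primitives look roughly like this. This is
a hedged approximation of the s_lock.h macros, not a verbatim copy, and
the helper name is invented for illustration:

```c
/* Approximation of PostgreSQL's per-architecture SPIN_DELAY();
 * spin_delay_approx() is an invented name, not PostgreSQL's. */
static inline void
spin_delay_approx(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* PAUSE: spin-wait hint; stalls the pipeline for roughly tens
     * to ~140 cycles depending on the microarchitecture. */
    __asm__ __volatile__(" rep; nop \n" ::: "memory");
#elif defined(__aarch64__)
    /* ISB: flushes the pipeline only, giving ~10-20 cycles of
     * delay; far cheaper per iteration than PAUSE. */
    __asm__ __volatile__(" isb \n" ::: "memory");
#endif
}
```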
Under PREEMPT_NONE, lock holders are rarely preempted, so spin
duration is short and ISB's lightweight delay is sufficient.
Under PREEMPT_LAZY, lock holder preemption becomes more frequent.
When this occurs, waiters enter a sustained spin loop. On arm64,
ISB provides negligible delay, so the loop runs at near-full speed,
hammering the lock cacheline via TAS_SPIN's *(lock) load on every
iteration. This generates massive cache coherency traffic that in
turn slows the lock holder's execution after rescheduling, creating
a feedback loop that escalates on high-core-count systems.
On x86_64, PAUSE throttles this loop sufficiently to prevent the
feedback loop, which explains why this is not reproducible there.
Patching PostgreSQL's arm64 spin_delay() to use WFE instead of ISB
should significantly reduce the regression without kernel changes.
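A minimal sketch of that idea, using C11 atomics rather than
PostgreSQL's actual TAS()/TAS_SPIN() macros; all names here are
invented, and the WFE wake-up behavior assumes the kernel's event
stream (or a store to the monitored line) eventually wakes waiters:

```c
#include <stdatomic.h>

typedef struct { atomic_int value; } slock_sketch;  /* invented type */

static inline void
cpu_relax_sketch(void)
{
#if defined(__aarch64__)
    /* WFE: parks the core until an event (SEV, interrupt, or the
     * periodic event stream), throttling the loop far more than
     * ISB does and keeping coherency traffic low. */
    __asm__ __volatile__(" wfe \n" ::: "memory");
#elif defined(__x86_64__) || defined(__i386__)
    /* PAUSE already throttles adequately on x86_64. */
    __asm__ __volatile__(" rep; nop \n" ::: "memory");
#endif
}

static void
slock_acquire(slock_sketch *lock)
{
    for (;;)
    {
        /* The TAS step: atomic exchange to claim the lock. */
        if (atomic_exchange_explicit(&lock->value, 1,
                                     memory_order_acquire) == 0)
            return;
        /* Inner wait: a plain load keeps the cacheline in a shared
         * state instead of bouncing it with RMW traffic on every
         * iteration. */
        while (atomic_load_explicit(&lock->value,
                                    memory_order_relaxed) != 0)
            cpu_relax_sketch();
    }
}

static void
slock_release(slock_sketch *lock)
{
    atomic_store_explicit(&lock->value, 0, memory_order_release);
}
```

The inner plain-load loop mirrors the TAS_SPIN-style re-check
described above; swapping its delay from ISB to WFE is the only
arm64-specific change being proposed.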
That said, the PREEMPT_LAZY default is likely to cause similar
breakage in other user-space applications beyond PostgreSQL that
rely on lightweight spin loops on arm64, so I agree that the patch
to retain PREEMPT_NONE is the right approach. Alternatively,
distributions can resolve this themselves by patching their default
kernel configuration.
Regards,
--
Mitsumasa Kondo
NTT Software Innovation Center