Re: [PATCH 0/1] sched: Restore PREEMPT_NONE as default
From: Mark Rutland
Date: Tue Apr 07 2026 - 07:21:00 EST
On Sun, Apr 05, 2026 at 12:21:55AM -0400, Andres Freund wrote:
> On 2026-04-04 21:40:29 -0400, Andres Freund wrote:
> > On 2026-04-04 13:42:22 -0400, Andres Freund wrote:
> > The benchmark script seems to indicate that huge pages aren't in use:
> > https://github.com/aws/repro-collection/blob/main/workloads/postgresql/main.sh#L15
For the benefit of those reading mail without a browser, the line in
question is:
${PG_HUGE_PAGES:=off} # off, try, on
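
As a side note on what that expansion does: `${VAR:=default}` only assigns
the default when the variable is unset or empty, so (assuming the script
inherits the variable from its environment) exporting PG_HUGE_PAGES=try
before running it should override the 'off' default. A minimal sketch;
the function name here is hypothetical and not part of the benchmark
script:

```shell
#!/bin/sh
# Hypothetical helper mirroring the script's default assignment.
effective_huge_pages() {
    # ':' is a no-op command; the expansion assigns 'off' only when
    # PG_HUGE_PAGES is unset or empty, and otherwise leaves it alone.
    : "${PG_HUGE_PAGES:=off}"
    printf '%s\n' "$PG_HUGE_PAGES"
}
```

So a caller who sets PG_HUGE_PAGES=try gets 'try', and a caller who sets
nothing gets 'off'.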
Per the PostgreSQL 17 documentation:
https://www.postgresql.org/docs/17/runtime-config-resource.html#GUC-HUGE-PAGES
... the default is 'try', though IIUC some additional system
configuration may be necessary to actually reserve huge pages, which
is also documented:
https://www.postgresql.org/docs/17/kernel-resources.html#LINUX-HUGE-PAGES
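
To make the "additional system configuration" concrete: IIUC that page
describes querying the required page count with
`postgres -C shared_memory_size_in_huge_pages` (with the server stopped)
and then setting vm.nr_hugepages accordingly. The arithmetic behind that
number is just a ceiling division, which can be sketched as below. The
function name and the example sizes are my own assumptions, not taken
from the benchmark script or the PostgreSQL sources:

```shell
#!/bin/sh
# Hypothetical helper: number of huge pages needed to back a shared
# memory segment.
#   $1 = shared memory size in bytes
#   $2 = huge page size in kB (the Hugepagesize value in /proc/meminfo)
hugepages_needed() {
    shm_bytes=$1
    hp_bytes=$(( $2 * 1024 ))
    # Round up so a partially used final page is still reserved.
    echo $(( (shm_bytes + hp_bytes - 1) / hp_bytes ))
}

# Example use (not executed here, requires root):
#   hp_kb=$(awk '/^Hugepagesize:/ {print $2}' /proc/meminfo)
#   sysctl -w vm.nr_hugepages="$(hugepages_needed "$shm_bytes" "$hp_kb")"
```

For example, 1 GiB of shared memory with 2048 kB huge pages needs 512
pages, and a single extra byte pushes that to 513.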
> > I wonder if somehow the pages underlying the portions of postgres' shared
> > memory are getting paged out for some reason, leading to page faults while
> > holding the spinlock?
>
> Hah. I had reflexively used huge_pages=on - as that is the only sane thing to
> do with 10s to 100s of GB of shared memory and thus part of all my
> benchmarking infrastructure - during the benchmark runs mentioned above.
Salvatore, was there a specific reason to test with PG_HUGE_PAGES=off
rather than PG_HUGE_PAGES=try? Was that arbitrary (e.g. because it was
the first of the possible options)?
IIUC from what Andres says here (and in other mails in this thread),
that's not a sensible/realistic configuration for this sort of workload,
and is the root cause of the contention (which seems to be exacerbated
by the scheduler model change).
As Andres noted, even ignoring the scheduler model, running with
PG_HUGE_PAGES=off results in a substantial performance penalty:
> *regardless* of the spinlock. PG 19 doesn't have the spinlock in this path anymore,
> but not using huge pages is still utterly terrible (like 1/3 of the
> throughput).
>
> I did run some benchmarks here and I don't see a clearly reproducible
> regression with huge pages.
Is the PG_HUGE_PAGES=off configuration important to you for some reason?
Mark.