Re: [PATCH 0/1] sched: Restore PREEMPT_NONE as default

From: Peter Zijlstra

Date: Tue Apr 07 2026 - 04:23:38 EST


On Sun, Apr 05, 2026 at 11:38:59AM +0530, Ritesh Harjani wrote:

> However, out of curiosity, I was hoping someone more familiar with the
> scheduler area could explain why PREEMPT_LAZY vs. PREEMPT_NONE causes a
> performance regression without huge pages?
>
> Minor page fault handling has microsecond latency, whereas the sched
> tick is in milliseconds. Besides, both preemption models should anyway
> schedule() if TIF_NEED_RESCHED is set on return to userspace, right?
>
> So I was curious to understand how the preemption model is causing a
> performance regression with no hugepages in this case?

So yes, everything can schedule on return-to-user (very much including
NONE), which is why the rseq slice extension is strongly recommended for
anything attempting user space spinlocks.

Where the other preemption modes differ is in scheduling while in kernel
mode. So if the workload spends significant time in the kernel, this
could cause more scheduling.

As you already mentioned, not using huge pages gives us more overhead on
#PF (and on TLB misses, but that's mostly hidden in access latency rather
than immediate system time). This gives more system time, and more room
to schedule.
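
One rough way to look at this from userspace is to touch fresh anonymous
memory and read the getrusage() counters around it. This is only a
sketch: the names sample_faults() and struct fault_sample are made up
for illustration, and the counters cannot tell an in-kernel preemption
from one taken on return-to-user, so it only shows whether the fault
path comes with more system time and more involuntary switches.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper types/names; only the libc calls are real APIs. */
struct fault_sample {
	long minflt;     /* minor faults taken while touching the region */
	long nivcsw;     /* involuntary context switches in that window */
	double stime_us; /* system time spent in that window, in usec */
};

static double tv_us(struct timeval tv)
{
	return (double)tv.tv_sec * 1e6 + (double)tv.tv_usec;
}

/*
 * Map npages of anonymous memory, write one byte per page so each
 * first touch goes through the fault path, and report the rusage
 * deltas. Returns 0 on success, -1 on failure.
 */
static int sample_faults(size_t npages, struct fault_sample *out)
{
	struct rusage before, after;
	long page = sysconf(_SC_PAGESIZE);
	volatile char *buf;
	size_t len, i;
	void *map;

	assert(page > 0);
	len = npages * (size_t)page;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;
	buf = map;

	if (getrusage(RUSAGE_SELF, &before)) {
		munmap(map, len);
		return -1;
	}

	for (i = 0; i < npages; i++)
		buf[i * (size_t)page] = 1;

	if (getrusage(RUSAGE_SELF, &after)) {
		munmap(map, len);
		return -1;
	}

	out->minflt = after.ru_minflt - before.ru_minflt;
	out->nivcsw = after.ru_nivcsw - before.ru_nivcsw;
	out->stime_us = tv_us(after.ru_stime) - tv_us(before.ru_stime);

	munmap(map, len);
	return 0;
}
```

Calling sample_faults() with a large npages on otherwise identical
kernels built with the two preemption models, and comparing nivcsw and
stime_us per fault, would at least show whether the extra system time
is being turned into extra scheduling.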

If we get preempted in the middle of a #PF, rather than finishing it,
this increases the #PF completion time, and if userspace is trying to
access this page concurrently, that access stalls for longer too. But
we should then see that in mmap_lock contention/idle time :/

I'm not sure I can explain any of this.