Re: sched/fair: DELAY_DEQUEUE causes ~25% pipe IPC regression on Raspberry Pi 5
From: Tom Gebhardt
Date: Tue Apr 21 2026 - 08:49:40 EST
Hi Mike,
thank you for confirming that the DELAY_DEQUEUE impact is reproducible
with stress-ng 0.15.06 on the stock 6.12.75+rpt-rpi-2712 kernel — that
is exactly the validation we needed.
The fact that your local 6.12.y branch with patches 0047 and 0089
(Christian's "Prevent negative lag increase during delayed dequeue")
does not reproduce the issue is very encouraging. It suggests that the
downstream RPi kernel just hasn't picked up those fixes yet, which
explains why we still see the regression on stock rpi-6.12.y.
Regarding the preemption model data: both our 6.6.78 and 6.12.81
kernels are built with CONFIG_PREEMPT=y (full preemption), not
voluntary or lazy. Your observation that preempt=lazy shows ~28% lower
throughput even on the patched kernel is interesting in itself — we
were not aware of that sensitivity on arm64. On our hardware, 6.6.78
with full preempt gives ~2.34M ops/s, which is actually slightly above
your voluntary preempt score of ~2.26M, so the comparison tracks
reasonably well.
On the patched kernel, DELAY_DEQUEUE on/off causes less than 2%
difference in both preemption modes (about 0.2% with voluntary, about
1.5% with lazy, going by your bogo ops/s), compared to the ~50%
regression we observe on stock 6.12.81 with 0.15.06 (2.34M ops/s on
6.6.78 vs. 1.16M on 6.12.81, same hardware). That is a clear
indication that 0089 is the key fix.
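In case anyone wants to re-derive these percentages, here is a minimal Python sketch. The bogo ops/s values are copied from your stress-ng output quoted below, and the 2.34M/1.16M pair is our rounded figure from this mail. The helper name pct_change and the variable names are my own choices for illustration, not part of stress-ng or any other tool.

```python
# Signed relative change, in percent, of `value` against `baseline`.
def pct_change(baseline: float, value: float) -> float:
    return (value - baseline) / baseline * 100.0

# Patched kernel, bogo ops/s (real time), first run of each preemption model
# (DELAY_DEQUEUE enabled).
voluntary_on = 2264640.83
lazy_on = 1635263.85

# Same kernel after `echo NO_DELAY_DEQUEUE > features`.
voluntary_off = 2268917.79
lazy_off = 1611264.75

# Preemption model sensitivity: lazy relative to voluntary.
lazy_vs_voluntary = pct_change(voluntary_on, lazy_on)

# Effect of switching DELAY_DEQUEUE off, per preemption model.
off_vs_on_voluntary = pct_change(voluntary_on, voluntary_off)
off_vs_on_lazy = pct_change(lazy_on, lazy_off)

# Stock kernels on our hardware, rounded ops/s: 6.12.81 relative to 6.6.78.
stock_612_vs_66 = pct_change(2.34e6, 1.16e6)

print(f"lazy vs voluntary:           {lazy_vs_voluntary:+.2f}%")
print(f"DELAY_DEQUEUE off, voluntary: {off_vs_on_voluntary:+.2f}%")
print(f"DELAY_DEQUEUE off, lazy:      {off_vs_on_lazy:+.2f}%")
print(f"stock 6.12.81 vs 6.6.78:     {stock_612_vs_66:+.2f}%")
```

Single 20-second runs carry some run-to-run noise, so sub-percent deltas here should be read as "no measurable effect" rather than as a real difference.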
We also tested rpi-7.0.y (self-built, CONFIG_PREEMPT=y, 16k page size)
and 6.18.21. Here is the picture from our hardware (RPi5 C1 stepping,
arm_freq=2400, stress-ng 0.15.06, 4 workers, 20 seconds):
Kernel              pipe MB/s   yield ns   swapctx/s
-------------------------------------------------------
6.6.78-v8-16k+         259.93       2072   2,060,126   ← reference
6.12.81-v8-16k+        183.66       2164   2,081,527   ← DELAY_DEQUEUE=ON
6.12.81 ipc-perf       250.75       2310   2,070,523   ← DELAY_DEQUEUE=OFF
6.18.21-v8-16k+        219.32       2073   2,089,853
7.0.0-v8-16k+          206.43       2175   2,067,052
The DELAY_DEQUEUE=OFF result (250.75 MB/s) clearly confirms your
analysis: it recovers almost all of the regression versus 6.6 stock
(259.93 MB/s), leaving only a ~4% gap. DELAY_DEQUEUE=ON costs ~27%
relative to OFF on 6.12.81 (183.66 vs. 250.75 MB/s), or ~29% relative
to 6.6.78.
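For reference, the gaps against the 6.6.78 baseline can be recomputed from the table with a short Python sketch. The pipe MB/s values are copied from the table above; the dictionary layout and the function name gap_vs_reference are only my illustration, not output of any tool.

```python
# Pipe write rate in MB/s per kernel, copied from the table above.
PIPE_MBS = {
    "6.6.78-v8-16k+": 259.93,    # reference
    "6.12.81-v8-16k+": 183.66,   # DELAY_DEQUEUE=ON
    "6.12.81 ipc-perf": 250.75,  # DELAY_DEQUEUE=OFF
    "6.18.21-v8-16k+": 219.32,
    "7.0.0-v8-16k+": 206.43,
}
REFERENCE = "6.6.78-v8-16k+"


def gap_vs_reference(rates: dict, reference: str) -> dict:
    """Return each kernel's signed throughput gap to the reference, in percent."""
    ref = rates[reference]
    return {kernel: (rate - ref) / ref * 100.0 for kernel, rate in rates.items()}


gaps = gap_vs_reference(PIPE_MBS, REFERENCE)

# DELAY_DEQUEUE=ON relative to DELAY_DEQUEUE=OFF, both on 6.12.81.
on_vs_off = (
    (PIPE_MBS["6.12.81-v8-16k+"] - PIPE_MBS["6.12.81 ipc-perf"])
    / PIPE_MBS["6.12.81 ipc-perf"]
    * 100.0
)

for kernel, gap in gaps.items():
    print(f"{kernel:<18} {PIPE_MBS[kernel]:8.2f} MB/s  {gap:+7.2f}% vs {REFERENCE}")
print(f"6.12.81 ON vs OFF: {on_vs_off:+.2f}%")
```

Note that the choice of baseline matters: the same ON result looks different depending on whether it is compared with 6.6.78 or with the OFF run on 6.12.81, so it is worth stating the baseline next to each percentage.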
Interestingly, 7.0.0 still shows a significant regression versus
6.6.78 at the same frequency: −21% at 2400 MHz (206 vs 260 MB/s), −19%
at 2800 MHz OC. If patches 0047/0089 are already in 7.0, something
else may still be at play; otherwise they may simply not have landed
in the rpi-7.0.y tree yet.
Would it make sense to push 0047 and 0089 to the stable rpi-6.12.y
branch, or is the expectation that they land via the upstream LTS
route first?
Best regards,
Thomas Gebhardt
On Tue, Apr 21, 2026 at 05:55, Mike Galbraith <efault@xxxxxx> wrote:
>
> On Mon, 2026-04-20 at 12:59 +0200, Tom Gebhardt wrote:
> > Hi Mike,
> >
> > thank you for testing — but I notice that your data only shows 6.12.75
> > with DELAY_DEQUEUE on/off. There is no 6.6 baseline in your results.
> > That is the comparison that matters: 6.6 vs 6.12 on the same hardware
> > with the same tool.
> >
> > There is also a version difference worth noting: you are running
> > stress-ng 0.21.00, while our measurements used stress-ng 0.15.06 —
> > which was the version available on Raspberry Pi OS Bookworm at the
> > time of the original report. The pipe stressor calculation changed
> > significantly between those versions.
> >
> > This creates an uncomfortable coincidence: a measurable scheduler
> > regression was introduced in 6.12 (DELAY_DEQUEUE), and around the same
> > time the standard measurement tool changed how it calculates the pipe
> > benchmark score. The result is that the regression becomes invisible
> > when comparing with the newer tool version — not because it was fixed,
> > but because the metric changed.
>
> Aha.
>
> > Would you be willing to run the 6.6 comparison with stress-ng 0.15.06
> > on your Pi 5?
>
> No need, your reported DELAY_DEQUEUE impact for 6.12.75+rpt-rpi-2712
> appeared with stress-ng 0.15.06.
>
> However, my local 6.12.y branch kernel, which has all eevdf fixes as
> well as Peter's still pending sched/ttwu series, still does NOT repro.
>
> git@homer:..git/raspberrypi-kernel> quilt applied|grep delay
> patches/eevdf/WIP/0043-sched-fair-Removed-unsued-cfs_rq.h_nr_delayed.patch
> patches/eevdf/WIP/0047-sched-fair-Do-not-try-to-migrate-delayed-dequeue-task.patch <== ?
> patches/eevdf/WIP/0065-sched-Change-ttwu_runnable-vs-sched_delayed.patch
> patches/eevdf/WIP/0066-sched-Add-ttwu_queue-support-for-delayed-tasks.patch
> patches/eevdf/WIP/0089-sched-fair-Prevent-negative-lag-increase-during-delayed-dequeue.patch <== ?
>
> -Mike
>
> aside: benchmark preemption model sensitivity magnitude poked me in the
> eye while testing, numbers below in case that's of interest to anyone.
>
> preempt=voluntary
> rpi5:..debug/sched # /usr/bin/stress-ng --pipe 4 --timeout 20s --metrics-brief
> stress-ng: info: [2566] setting to a 20 second run per stressor
> stress-ng: info: [2566] dispatching hogs: 4 pipe
> stress-ng: metrc: [2566] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
> stress-ng: metrc: [2566]                           (secs)    (secs)    (secs)  (real time) (usr+sys time)
> stress-ng: metrc: [2566] pipe           45294217     20.00     22.71     57.23   2264640.83      566587.74
> stress-ng: metrc: [2566] miscellaneous metrics:
> stress-ng: metrc: [2566] pipe 276.48 MB per sec pipe write rate (geometic mean of 4 instances)
> stress-ng: info: [2566] successful run completed in 20.01s
> rpi5:..debug/sched # echo NO_DELAY_DEQUEUE > features
> rpi5:..debug/sched # /usr/bin/stress-ng --pipe 4 --timeout 20s --metrics-brief
> stress-ng: info: [2650] setting to a 20 second run per stressor
> stress-ng: info: [2650] dispatching hogs: 4 pipe
> stress-ng: metrc: [2650] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
> stress-ng: metrc: [2650]                           (secs)    (secs)    (secs)  (real time) (usr+sys time)
> stress-ng: metrc: [2650] pipe           45379684     20.00     22.69     57.24   2268917.79      567687.53
> stress-ng: metrc: [2650] miscellaneous metrics:
> stress-ng: metrc: [2650] pipe 276.99 MB per sec pipe write rate (geometic mean of 4 instances)
> stress-ng: info: [2650] successful run completed in 20.00s
> rpi5:..debug/sched # echo DELAY_DEQUEUE > features
> rpi5:..debug/sched # /usr/bin/stress-ng --pipe 4 --timeout 20s --metrics-brief
> stress-ng: info: [2721] setting to a 20 second run per stressor
> stress-ng: info: [2721] dispatching hogs: 4 pipe
> stress-ng: metrc: [2721] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
> stress-ng: metrc: [2721]                           (secs)    (secs)    (secs)  (real time) (usr+sys time)
> stress-ng: metrc: [2721] pipe           45217909     20.00     21.81     58.13   2260821.48      565698.97
> stress-ng: metrc: [2721] miscellaneous metrics:
> stress-ng: metrc: [2721] pipe 276.00 MB per sec pipe write rate (geometic mean of 4 instances)
> stress-ng: info: [2721] successful run completed in 20.00s
> rpi5:..debug/sched #
>
> preempt=lazy
> rpi5:..debug/sched # /usr/bin/stress-ng --pipe 4 --timeout 20s --metrics-brief
> stress-ng: info: [2070] setting to a 20 second run per stressor
> stress-ng: info: [2070] dispatching hogs: 4 pipe
> stress-ng: metrc: [2070] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
> stress-ng: metrc: [2070]                           (secs)    (secs)    (secs)  (real time) (usr+sys time)
> stress-ng: metrc: [2070] pipe           32706095     20.00     17.88     62.02   1635263.85      409343.56
> stress-ng: metrc: [2070] miscellaneous metrics:
> stress-ng: metrc: [2070] pipe 199.63 MB per sec pipe write rate (geometic mean of 4 instances)
> stress-ng: info: [2070] successful run completed in 20.00s
> rpi5:..debug/sched # echo NO_DELAY_DEQUEUE > features
> rpi5:..debug/sched # /usr/bin/stress-ng --pipe 4 --timeout 20s --metrics-brief
> stress-ng: info: [2149] setting to a 20 second run per stressor
> stress-ng: info: [2149] dispatching hogs: 4 pipe
> stress-ng: metrc: [2149] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
> stress-ng: metrc: [2149]                           (secs)    (secs)    (secs)  (real time) (usr+sys time)
> stress-ng: metrc: [2149] pipe           32226171     20.00     17.93     61.98   1611264.75      403306.31
> stress-ng: metrc: [2149] miscellaneous metrics:
> stress-ng: metrc: [2149] pipe 196.70 MB per sec pipe write rate (geometic mean of 4 instances)
> stress-ng: info: [2149] successful run completed in 20.01s
> rpi5:..debug/sched # echo DELAY_DEQUEUE > features
> rpi5:..debug/sched # /usr/bin/stress-ng --pipe 4 --timeout 20s --metrics-brief
> stress-ng: info: [2210] setting to a 20 second run per stressor
> stress-ng: info: [2210] dispatching hogs: 4 pipe
> stress-ng: metrc: [2210] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
> stress-ng: metrc: [2210]                           (secs)    (secs)    (secs)  (real time) (usr+sys time)
> stress-ng: metrc: [2210] pipe           32762333     20.00     18.08     61.86   1638067.42      409835.28
> stress-ng: metrc: [2210] miscellaneous metrics:
> stress-ng: metrc: [2210] pipe 199.99 MB per sec pipe write rate (geometic mean of 4 instances)
> stress-ng: info: [2210] successful run completed in 20.01s
>
>