Re: Kernel 6.14.11 dl_server_timer(...) causing IPI/Function Call Interrupts on isolcpu/nohz_full cores, performance regression
From: Juri Lelli
Date: Fri Apr 24 2026 - 08:39:13 EST
Hello,
On 11/04/26 00:39, Cao Ruichuang wrote:
> Hi Juri,
>
> I tested the "sched/deadline: Make dl-server nohz full aware" change
> shape from f237e524f3c7 on the current mainline tree in a minimal QEMU
> setup.
>
> I used a latest-tree tiny x86 kernel with:
>
> nohz_full=1 isolcpus=domain,managed_irq,1 irqaffinity=0 rcu_nocbs=1
>
> and a CPU1-pinned busy loop as the only user workload. To observe the
> periodic activity, I installed a kprobe on start_dl_timer and counted
> hits over the same 5 second window.
>
> On the unmodified tree, I consistently saw:
>
> START_DL_TIMER_COUNT=118
>
> With the sched_can_stop_tick() change from f237e524f3c7 applied, I saw:
>
> START_DL_TIMER_COUNT=20
> START_DL_TIMER_COUNT=22
>
> So on current mainline this still looks like a real improvement in the
> same direction David reported earlier: the periodic dl-server activity
> is reduced substantially, although it is not eliminated completely in my
> QEMU setup.
>
> I am not sending this as a patch, only as an extra data point in favor
> of that fix direction on a latest-tree test setup.
Right. I also managed to reproduce this again on mainline. The actual
difference from when I first started looking at this is that, in my
test setup (vng), the "one per second" dl-server period timer gets
handled (and reprogrammed) on housekeeping CPUs. So it doesn't interrupt
the isolated busy workload, but it still seems useless.
In the meantime we added a dl-server for scx as well, so my first
attempt gets a little uglier when trying to take that into account. I
will need more time to see if I can come up with something better.
Thanks,
Juri