Re: [PATCH v5 0/4] sched: Fix missing irq time when CONFIG_IRQ_TIME_ACCOUNTING is enabled

From: Yafang Shao
Date: Thu Nov 21 2024 - 22:18:45 EST


On Mon, Nov 18, 2024 at 6:10 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Sun, Nov 17, 2024 at 10:56:21AM +0800, Yafang Shao wrote:
> > On Fri, Nov 15, 2024 at 9:41 PM Michal Koutný <mkoutny@xxxxxxxx> wrote:
>
> > > > The load balancer is malfunctioning due to the exclusion of IRQ time from
> > > > CPU utilization calculations.
> > >
> > > Could this be fixed by subtracting (global) IRQ time from (presumed
> > > total) system capacity that the balancer uses for its decisions? (i.e.
> > > without exact per-cgroup breakdown of IRQ time)
> >
> > The issue here is that the global IRQ time may include the interrupted
> > time of tasks outside the target cgroup. As a result, I don't believe
> > it's possible to find a reliable solution without modifying the
> > kernel.
>
> Since there is no relation between the interrupt and the interrupted
> task (and through that its cgroup) -- all time might or might not be
> part of your cgroup of interest. Consider it a random distribution if
> you will.

Some points require further clarification.

On our servers, the majority of IRQ/softIRQ activity originates from
network traffic, and we consistently enable Receive Flow Steering
(RFS) [0]. This configuration ensures that softIRQs are more likely to
interrupt the tasks responsible for processing the corresponding
packets. As a result, the distribution of softIRQs is not random but
instead closely aligned with the packet-handling tasks.

[0]. https://lwn.net/Articles/381955/