Re: [PATCH V6 1/1] Softirq:avoid large sched delay from the pending softirqs

From: Qais Yousef
Date: Mon Sep 14 2020 - 07:32:01 EST


On 09/11/20 20:28, peterz@xxxxxxxxxxxxx wrote:
> On Fri, Sep 11, 2020 at 05:46:45PM +0100, Qais Yousef wrote:
> > On 09/09/20 17:09, qianjun.kernel@xxxxxxxxx wrote:
> > > From: jun qian <qianjun.kernel@xxxxxxxxx>
> > >
> > > When the pending softirqs are fetched, all of them are processed in
> > > the while loop. If each pending softirq needs more than 2 msec to
> > > process in this loop, or if one of the softirqs runs for a long time,
> > > the original code logic still processes all the pending softirqs
> > > without waking up ksoftirqd. This causes a relatively large scheduling
> > > delay on the corresponding CPU, which we do not wish to see. The patch
> > > checks the total time spent processing pending softirqs; if the time
> > > exceeds 2 ms, we wake up ksoftirqd to avoid a large sched delay.
> > >
> > > Signed-off-by: jun qian <qianjun.kernel@xxxxxxxxx>
> >
> > In Android there's a patch that tries to avoid scheduling an RT task on a CPU
> > that is running softirqs. I wonder if this patch helps with this case.
> >
> > https://android.googlesource.com/kernel/msm/+/5c3f54c34acf4d9ed01530288d4a98acff815d79%5E%21/#F0
> >
> > John, Wei, is this something of interest to you?
>
> Urgh.. that's pretty gross. I think the sane approach is indeed getting
> softirqs to react to need_resched() better.
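
For reference, here is a rough userspace sketch of how I read the time-budget
idea in the patch description above. It is not the patch's code and uses no
kernel API: struct sim_cpu, sim_do_softirq(), NR_VECS and the fake clock are
all hypothetical names made up for illustration. Only the 2 ms budget and the
"wake ksoftirqd when over budget" behaviour come from the description.

```c
#include <assert.h>
#include <stdint.h>

#define NR_VECS   10                      /* hypothetical number of vectors */
#define BUDGET_NS (2ULL * 1000 * 1000)    /* 2 ms, as in the patch description */

/* Simulated per-CPU state; none of these names exist in the kernel. */
struct sim_cpu {
	uint32_t pending;            /* bitmask of pending vectors */
	uint64_t now_ns;             /* fake monotonic clock */
	uint64_t cost_ns[NR_VECS];   /* simulated run time of each handler */
	int ksoftirqd_woken;         /* set when remaining work is deferred */
};

/*
 * Run pending vectors in order until none are left or the time budget
 * is exhausted. Returns the number of handlers that ran.
 */
static int sim_do_softirq(struct sim_cpu *cpu)
{
	uint64_t start = cpu->now_ns;
	int ran = 0;

	for (int nr = 0; nr < NR_VECS; nr++) {
		if (!(cpu->pending & (1u << nr)))
			continue;

		cpu->pending &= ~(1u << nr);
		cpu->now_ns += cpu->cost_ns[nr];   /* "run" the handler */
		ran++;

		/*
		 * Check the elapsed time after every handler. If work is
		 * still pending and the budget is used up, leave the rest
		 * pending and hand it over to the (simulated) ksoftirqd.
		 */
		if (cpu->pending && cpu->now_ns - start >= BUDGET_NS) {
			cpu->ksoftirqd_woken = 1;
			break;
		}
	}
	return ran;
}
```

With three pending vectors that each take 1.5 ms, the loop stops after the
second one and leaves the third for ksoftirqd. Note that a single long-running
handler still runs to completion here, since the check only happens between
handlers.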

What does PREEMPT_RT do to deal with softirq delays?

I have tried playing with enabling threadirqs, which AFAIU should make softirqs
preemptible, right?

I realize this patch is still missing from mainline at least:

https://gitlab.com/kalilinux/packages/linux/blob/a17bad0db9da44cd73f594794a58cc5646393b13/debian/patches-rt/softirq-Add-preemptible-softirq.patch

Would this be a heavy-handed approach to make available for non-PREEMPT_RT
kernels?

I only worry about potential NET_RX throughput issues. NET_RX, by the way, is
currently protected with preempt_disable() in mainline; see netif_rx_ni().

I am guessing here, but I suspect this NET_RX softirq is one source of big
delays when network activity is high.

Thanks

--
Qais Yousef