Re: [RFC PATCH 1/1] Softirq:avoid large sched delay from the pending softirqs

From: jun qian
Date: Mon Jul 20 2020 - 07:33:24 EST


On Sat, Jul 18, 2020 at 6:07 AM Uladzislau Rezki <urezki@xxxxxxxxx> wrote:
>
> > From: jun qian <qianjun.kernel@xxxxxxxxx>
> >
> > When get the pending softirqs, it need to process all the pending
> > softirqs in the while loop. If the processing time of each pending
> > softirq is need more than 2 msec in this loop, or one of the softirq
> > will running a long time, according to the original code logic, it
> > will process all the pending softirqs without wakeuping ksoftirqd,
> > which will cause a relatively large scheduling delay on the
> > corresponding CPU, which we do not wish to see. The patch will check
> > the total time to process pending softirq, if the time exceeds 2 ms
> > we need to wakeup the ksofirqd to aviod large sched delay.
> >
> > Signed-off-by: jun qian <qianjun.kernel@xxxxxxxxx>
> > ---
> > kernel/softirq.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > index c4201b7f..602d9fa 100644
> > --- a/kernel/softirq.c
> > +++ b/kernel/softirq.c
> > @@ -299,6 +299,9 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> > }
> > h++;
> > pending >>= softirq_bit;
> > +
> > + if (time_after(jiffies, end) && need_resched())
> > + break;
> > }
> >
> > if (__this_cpu_read(ksoftirqd) == current)
> >
> I have a small concern about MAX_SOFTIRQ_TIME. The problem is that
> an "end" time is based on jiffies/tick update, so it depends on CONFIG_HZ
> value of your kernel.
>
> For example if we have CONFIG_HZ=100, msecs_to_jiffies(2) will return 1.
> For HZ=100 one jiffie is 10 milliseconds. So we can not rely on it,
> because of low resolution.
>
good tip. Does this problem also exist in the current code, just like this:

if (pending) {
if (time_before(jiffies, end) && !need_resched() &&
/* low resolution problem */
--max_restart)
goto restart;

wakeup_softirqd();
}

> Maybe it make sense to fix it first in order to be at least aligned with
> "2 milliseconds time limit" documentation?
>
> <snip>
> * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times,
> * but break the loop if need_resched() is set or after 2 ms.
> <snip>
>
I can't find the snip from the linux/Documentation/, could you please
tell me where I can find this snip, thks

> ktime_get()/ktime_before()...?
>
if the low resolution problem also exists in the above code, i think
also need to fix it with using ktime_get()/ktime_before().

> --
> Vlad Rezki