Re: [PATCH 0/3] softirq: uncontroversial change

From: Paolo Abeni
Date: Fri Apr 21 2023 - 05:34:56 EST


On Fri, 2023-04-21 at 10:48 +0800, Jason Xing wrote:
>
> > My understanding is that we want to avoid adding more heuristics here,
> > preferring a consistent refactor.
> >
> > I would like to propose a revert of:
> >
> > 4cd13c21b207 softirq: Let ksoftirqd do its job
> >
> > the its follow-ups:
> >
> > 3c53776e29f8 Mark HI and TASKLET softirq synchronous
> > 0f50524789fc softirq: Don't skip softirq execution when softirq thread is parking
>
> More than this, I list some related patches mentioned in the above
> commit 3c53776e29f8:
> 1ff688209e2e ("watchdog: core: make sure the watchdog_worker is not deferred")
> 8d5755b3f77b ("watchdog: softdog: fire watchdog even if softirqs do
> not get to run")
> 217f69743681 ("net: busy-poll: allow preemption in sk_busy_loop()")

The first 2 changes replace plain timers with HR ones, could possibly
be reverted, too, but it should not be a big deal either way.

I think instead we want to keep the third commit above, as it should be
useful when napi threaded is enabled.

Generally speaking I would keep the initial revert to the bare minimum.

> > The problem originally addressed by 4cd13c21b207 can now be tackled
> > with the threaded napi, available since:
> >
> > 29863d41bb6e net: implement threaded-able napi poll loop support
> >
> > Reverting the mentioned commit should address the latency issues
> > mentioned by Jakub - I verified it solves a somewhat related problem in
> > my setup - and reduces the layering of heuristics in this area.
>
> Sure, it is. I also can verify its usefulness in the real workload.
> Some days ago I also sent a heuristics patch [1] that can bypass the
> ksoftirqd if the user chooses to mask some type of softirq. Let the
> user decide it.
>
> But I observed that if we mask some softirqs, or we can say,
> completely revert the commit 4cd13c21b207, the load would go higher
> and the kernel itself may occupy/consume more time than before. They
> were tested under the similar workload launched by our applications.
>
> [1]: https://lore.kernel.org/all/20230410023041.49857-1-kerneljasonxing@xxxxxxxxx/

Thanks for the reference, I would have missed that patch otherwise.

My understanding is that adding more knobs here is in the opposite
direction of what Thomas is suggesting, and IMHO the 'now mask' should
not be exposed to user-space.

>
Thanks for the feedback,

Paolo