Re: [PATCH V6 1/1] Softirq:avoid large sched delay from the pending softirqs

From: Qais Yousef
Date: Mon Sep 14 2020 - 11:29:13 EST


On 09/14/20 16:14, peterz@xxxxxxxxxxxxx wrote:
> On Mon, Sep 14, 2020 at 12:27:35PM +0100, Qais Yousef wrote:
> > What does PREEMPT_RT do to deal with softirqs delays?
>
> Makes the lot preemptible, you found the patch below.
>
> > I have tried playing with enabling threadirqs, which AFAIU should make softirqs
> > preemptible, right?
>
> Not yet,..
>
> > I realize this patch is still missing from mainline at least:
> >
> > https://gitlab.com/kalilinux/packages/linux/blob/a17bad0db9da44cd73f594794a58cc5646393b13/debian/patches-rt/softirq-Add-preemptible-softirq.patch
> >
> > Would this be a heavy handed approach to make available for non PREEMPT_RT
> > kernels?
>
> Not sure, I suspect it relies on migrate_disable(), which is
> preempt_disable() on !RT and then we're back to square one.

I think it will depend on local_bh_disable(). I didn't dig into the patch
above, but I believe it's doing that for RT.

Or maybe there's another aspect I am not aware of that relies on
migrate_disable() too..

>
> > I only worry about potential NET_RX throughput issues. Which by the way is
> > protected with preempt_disable currently in mainline. See netif_rx_ni().
>
> So preempt_disable() isn't necessarily a problem, you just want it to

Yes. But high network traffic will make this a busy softirq. And it won't work
with the patch above. But I assume the above will have to fix that with it.

https://lore.kernel.org/netdev/20170616172400.10809-1-bigeasy@xxxxxxxxxxxxx/

For the time being, it's just another potential path that could introduce
latencies.

I can't follow the whole thing either, but if 5G modems end up there, I can see
this being a big source of noise when the user is downloading a big file,
assuming 5G lives up to its reputation of 400+ Mbps in practice.

So there might be a tangible trade-off between better softirq latencies and
better network throughput.

> terminate soonish after need_resched() becomes true. Also, I'm having a
> wee problem getting from net_rx_action() to netif_rx_ni()

I can investigate this direction :)

>
> > I am guessing here, but I suspect this NET_RX softirq is one source of big
> > delays when network activity is high.
>
> Well, one approach is to more aggressively limit how long softirq
> processing can run. Current measures are very soft in that regard.

Which is what this patch does, although IIUC it doesn't take into account a
single softirq exceeding the quota. But the need_resched() bit above should
address that.
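
To illustrate the point about a single softirq exceeding the quota, here is a
minimal userspace sketch. It is not kernel code and not taken from the patch:
every name in it (sim_softirq, sim_result, sim_run_softirqs, the budget and
resched_at parameters) is made up for illustration, and time is modelled as
plain integer units. It only assumes that the limit and a need_resched()-style
flag are checked between handlers, never inside one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct sim_softirq {
	unsigned int cost;	/* simulated time units one handler takes */
};

struct sim_result {
	size_t handled;		/* handlers that actually ran */
	unsigned int elapsed;	/* simulated time consumed in total */
	bool deferred;		/* true if work was left for a later pass */
};

/*
 * Run pending handlers in order. Before each handler, stop if the time
 * budget is used up or if a reschedule was requested at or before the
 * current simulated time (resched_at). The checks happen only between
 * handlers, so one long handler still overruns the budget.
 */
struct sim_result sim_run_softirqs(const struct sim_softirq *pending,
				   size_t n, unsigned int budget,
				   unsigned int resched_at)
{
	struct sim_result r = { 0, 0, false };

	assert(pending != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (r.elapsed >= budget || r.elapsed >= resched_at) {
			r.deferred = true;	/* hand the rest to a thread */
			break;
		}
		r.elapsed += pending[i].cost;
		r.handled++;
	}
	return r;
}
```

With four handlers costing 3 units each and a budget of 5, the loop stops after
two of them. With one handler costing 100 units and a budget of 10, the whole
100 units are still spent, which is the case a between-handlers limit cannot
cover on its own.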

Thanks

--
Qais Yousef