Re: Softirq priority inversion from "softirq: reduce latencies"

From: Eric Dumazet
Date: Sat Feb 27 2016 - 21:00:43 EST


On Sat, 2016-02-27 at 15:33 -0800, Peter Hurley wrote:
> On 02/27/2016 03:04 PM, David Miller wrote:
> > From: Peter Hurley <peter@xxxxxxxxxxxxxxxxxx>
> > Date: Sat, 27 Feb 2016 12:29:39 -0800
> >
> >> Not really. softirq raised from interrupt context will always execute
> >> on this cpu and not in ksoftirqd, unless load forces softirq loop abort.
> >
> > That guarantee never was specified.
>
> ??
>
> Neither is running network socket servers at normal priority as if they're
> higher priority than softirq.
>
>
> > Or are you saying that by design, on a system under load, your UART
> > will not function properly?
> >
> > Surely you don't mean that.
>
> No, that's not what I mean.
>
> What I mean is that bypassing the entire SOFTIRQ priority so that
> sshd can process one network packet makes a mockery of the point of softirq.
>
> This hack to workaround NET_RX looping over-and-over-and-over affects every
> subsystem, not just one uart.
>
> HI, TIMER, BLOCK; all of these are skipped: that's straight-up, a bug.

I have no idea what you are talking about.

All pending softirq interrupts are processed. _Nothing_ is skipped.

Really, your system stability seems to depend on a completely
undocumented behavior of Linux kernels before linux-3.8.

If I understand correctly, you expect a tasklet activated from a softirq
handler to run in the same __do_softirq() loop. This has never been
the case.
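
To make that point concrete, here is a minimal user-space sketch of the
snapshot-and-restart shape of __do_softirq(). It is a model under stated
assumptions, not kernel code: the vector list is abbreviated, raise_vec(),
softirq_pass() and do_softirq_model() are hypothetical names, and the
kernel's time-limit and need_resched() checks are collapsed into a single
max_restart budget.

```c
#include <assert.h>

/* Abbreviated vector list for this model; the lowest bit is served first. */
enum { HI, TIMER, NET_TX, NET_RX, BLOCK, TASKLET, NR_VEC };

static unsigned int pending;   /* models the per-CPU pending mask */
static int tasklet_runs;       /* number of times the TASKLET vector ran */

/* Hypothetical stand-in for raising a softirq: just set the pending bit. */
static void raise_vec(int nr)
{
    assert(nr >= 0 && nr < NR_VEC);
    pending |= 1u << nr;
}

/* Hypothetical NET_RX handler that arms a tasklet instead of doing the work. */
static void net_rx_action(void)  { raise_vec(TASKLET); }
static void tasklet_action(void) { tasklet_runs++; }

static void run_vec(int nr)
{
    if (nr == NET_RX)
        net_rx_action();
    else if (nr == TASKLET)
        tasklet_action();
    /* The other vectors do nothing in this model. */
}

/*
 * One pass: snapshot the pending mask, clear it, and run only the
 * snapshot. Bits raised by a handler land in 'pending', not in 'snap',
 * so they are not served by this pass. Returns the newly pending mask.
 */
static unsigned int softirq_pass(void)
{
    unsigned int snap = pending;

    pending = 0;
    for (int nr = 0; nr < NR_VEC; nr++)
        if (snap & (1u << nr))
            run_vec(nr);
    return pending;
}

/*
 * Repeat passes while work is pending and the restart budget lasts.
 * Returns 1 when leftover work would be handed to ksoftirqd, else 0.
 */
static int do_softirq_model(int max_restart)
{
    while (softirq_pass()) {
        if (--max_restart <= 0)
            return 1;
    }
    return 0;
}
```

In this model, calling raise_vec(NET_RX) and then do_softirq_model(1)
returns 1 with tasklet_runs still 0: the tasklet armed by the NET_RX
handler is left for ksoftirqd. With a budget of 2 the tasklet runs, but
only in the second pass. Whether a second pass happens at all depends on
the budget, which is the part a driver cannot rely on.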

My change simply triggers the bug in your driver earlier. As David
pointed out, the same bug should trigger on a loaded machine, even
if you revert my patch.

I honestly do not know why you arm a tasklet from NET_RX. Why don't you
simply process this directly, so that you do not rely on some scheduler
decision?