Re: Softirq priority inversion from "softirq: reduce latencies"

From: Peter Hurley
Date: Mon Feb 29 2016 - 10:54:37 EST

Next message: Dmitry Vyukov: "Re: fs: NULL deref in atime_needs_update"
Previous message: Toralf Förster: "Re: small diff in reported GiB size of an SSD"
In reply to: Eric Dumazet: "Re: Softirq priority inversion from "softirq: reduce latencies""
Next in thread: Eric Dumazet: "Re: Softirq priority inversion from "softirq: reduce latencies""
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 02/29/2016 07:19 AM, Eric Dumazet wrote:
> On lun., 2016-02-29 at 07:03 -0800, Peter Hurley wrote:
>
>> Not the case. The softirq is raised from interrupt.
>>
>> Before Eric's change, when an interrupt raises a new softirq
>> while processing another softirq, the new softirq is immediately
>> processed *after the existing softirq completes*.
>>
>> After Eric's change, when an interrupt raises a new softirq
>> while processing another softirq and _that softirq wakes a process_,
>> the new softirq is *deferred to normal process priority*.
>
> For the last time, this is not true.
>
> My patch changed the probability for this to happen.

There is a huge difference between
1. heavy i/o load forcing ksoftirqd to battle it out with regular
sched processes *as a fallback to avoid 100% softirq*, and
2. always deferring a new softirq just because a process was woken.

> It will happen even if you revert it.

I think there is a happy medium where finer constraints on
softirq looping will get us both what we want.

For example, an accumulating mask of softirqs already run would
keep one softirq level from looping over and over. Or a per-softirq
limiting counter. Or relying on the hard limit that was added later
of a fixed number of softirq loops. Or a combination of those.
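
To make the accumulating-mask idea concrete, here is a rough userspace
sketch, not kernel code and not a proposed patch. Every name in it
(softirq_loop_sketch, run_fn, demo_ctx, demo_run, MAX_RESTART_SKETCH) is
hypothetical, and the cap of 10 passes is only an assumed value. A softirq
that has already run in this invocation is not run again; it stays pending
and is handed back to the caller, which stands in for deferring it to
ksoftirqd. Softirqs that have not yet run are still processed immediately.

```c
#include <stdint.h>

/* Assumed hard cap on passes; the value is illustrative only. */
#define MAX_RESTART_SKETCH 10

/*
 * Hypothetical callback: runs the handlers for every bit set in `run`
 * and returns the mask of softirqs raised while they were running.
 */
typedef uint32_t (*run_fn)(uint32_t run, void *ctx);

/*
 * Process pending softirqs, but never run the same softirq twice in
 * one invocation. Returns the mask that is still pending on exit,
 * i.e. what would be deferred.
 */
static uint32_t softirq_loop_sketch(uint32_t pending, run_fn run, void *ctx)
{
    uint32_t already_run = 0;   /* accumulating mask of softirqs run so far */
    int restarts = MAX_RESTART_SKETCH;

    while (pending) {
        uint32_t runnable = pending & ~already_run;

        /* Stop when only repeats remain or the pass cap is reached. */
        if (!runnable || restarts-- == 0)
            break;

        already_run |= runnable;

        /* Repeats stay pending; newly raised bits are merged in. */
        uint32_t repeats = pending & ~runnable;
        pending = repeats | run(runnable, ctx);
    }
    return pending;
}

/* Toy model for trying the loop: raises[nr] is what softirq nr raises. */
struct demo_ctx {
    uint32_t raises[32];
    unsigned runs[32];
};

static uint32_t demo_run(uint32_t run, void *p)
{
    struct demo_ctx *c = p;
    uint32_t raised = 0;

    for (int nr = 0; nr < 32; nr++) {
        if (run & (UINT32_C(1) << nr)) {
            c->runs[nr]++;
            raised |= c->raises[nr];
        }
    }
    return raised;
}
```

In this model, a softirq that keeps re-raising itself runs once and is then
left pending, while a different softirq raised in the meantime still runs
right away in the same invocation.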

> linux never claimed that softirq could steal all cpu time.

That's not the problem observed here.

In fact, what your patch triggers is exactly the opposite:
although cpu load is initially very light because DMA is used to perform
device i/o, once DMA is not being serviced in a timely manner, the
driver falls back to purely interrupt-driven i/o, which dramatically
increases the real cpu load at those line rates.

> Are you by any chance still running a HZ=100 kernel ?

The current kernel is HZ=250 but this would occur on HZ=1000 as well.

Regards,
Peter Hurley
