Re: dvb usb issues since kernel 4.9
From: Linus Torvalds
Date: Mon Jan 08 2018 - 13:33:06 EST
On Mon, Jan 8, 2018 at 9:55 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> as I doubt we have enough time to root-case this properly.
Well, it's not like this is a new issue, and we don't have to get it
fixed for 4.15. It's been around since 4.9, it's not a "have to
suddenly fix it this week" issue.
I just think that people should plan on having to maybe revert it and
mark the revert for stable.
But if the USB or DVB layers can instead just make the packet queue a
bit deeper and not react so badly to the latency of a single softirq,
that would obviously be a good thing in general, and maybe fix this
issue. So I'm not saying that the revert is inevitable either.
But I have to say that that commit 4cd13c21b207 ("softirq: Let
ksoftirqd do its job") was a pretty damn big hammer, and entirely
ignored the "softirqs can have latency concerns" issue.
So I do feel like the UDP packet storm thing might want a somewhat
more directed fix than that huge hammer of trying to move softirqs
aggressively into the softirq thread.
This is not that different from threaded irqs. And while you can set
the "thread every irq" flag, that would be largely insane to do by
default and in general. So instead, people do it either for specific
irqs (ie "request_threaded_irq()") or they have a way to opt out of it
(IRQF_NO_THREAD).
I _suspect_ that the softirq thing really just wants the same thing.
Have the networking case maybe set the "prefer threaded" flag just for
networking, if it's less latency-sensitive for softirq handling than
In fact, even for networking, there are separate TX/RX softirqs, maybe
networking would only set it for the RX case? Or maybe even trigger it
only for cases where things queue up and it goes into a "polling mode"
(like NAPI already does).
Of course, I don't even know _which_ softirq it is that the DVB case
has issues with. Maybe it's the same NET_RX case?
But looking at that offending commit, I do note (for example), that we
literally have things like tasklet[_hi]_schedule() that might have
been explicitly expected to just run the tasklet at a fairly low
latency (maybe instead of a workqueue exactly because it doesn't need
to sleep and wants lower latency).
So saying "just because softirqd is possibly already woken up, let's
delay all those tasklets etc" does really seem very wrong to me.
Can somebody tell which softirq it is that dvb/usb cares about?
Linus