Re: [REGRESSION 4.20-rc1] 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds")

From: tomli
Date: Tue Feb 05 2019 - 20:22:41 EST

> OK, thanks. This looks slightly different from the Loongson problem:
> - In Loongson, we don't get stuck in RCU, but in
> cpufreq_dbs_governor_stop -> irq_work_sync().
> - I run non-preemptible kernel, and my system still gets stuck.
> What is common is that it's UP with i8259 PIC.
> A.

Now this is an interesting case. On my machine, the problem I
encountered seems to be identical to the one in the original thread:
disabling preemption effectively solves the lockup. Also, my issue
occurs not only on 4.20-rc1 but also on earlier kernels, with a
lower probability.

On your machine, however, you have a non-identical but closely
related issue. It seems the timing-dependent lockup in the i8259 PIC
can be triggered in different ways.

The conclusion is clear, though: there's a real lockup condition in
the i8259 PIC driver, and it's causing real issues. Aaro, have you tried
submitting your i8259 patch to mainline? Despite your concerns
about its underlying cause, I think a fix should be submitted. If there
are no objections from the maintainers, I suggest submitting it
upstream and sending it to linux-stable, requesting that it be
applied to the 3.16, 4.4, 4.9, 4.14, 4.19 and 4.20 stable branches. If
you are busy, I can help submit it.

Tom Li