Re: [Bug 215533] [BISECTED][REGRESSION] UI becomes unresponsive every couple of seconds

From: Jan Kiszka
Date: Tue Feb 08 2022 - 01:35:57 EST


On 07.02.22 23:45, Bjorn Helgaas wrote:
> [+cc linux-kernel for visibility]
>
> On Wed, Jan 26, 2022 at 06:12:50AM -0600, Bjorn Helgaas wrote:
>> On Wed, Jan 26, 2022 at 08:18:12AM +0000, bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=215533
>>>
>>> --- Comment #1 from joey.corleone@xxxxxxx ---
>>> I accidentally sent the report prematurely. So here come my findings:
>>>
>>> Since 5.16
>>> (1) my system becomes unresponsive every couple of seconds (micro lags), which
>>> makes it more or less unusable.
>>> (2) wrong(?) CPU frequencies are reported.
>>>
>>> - 5.15 works fine.
>>> - Starting from some commit in 5.17, it seems (1) is fixed (unsure), but
>>> definitely not (2).
>>>
>>> I have bisected the kernel between 5.15 and 5.16, and found that the offending
>>> commit is 0e8ae5a6ff5952253cd7cc0260df838ab4c21009 ("PCI/portdrv: Do not setup
>>> up IRQs if there are no users"). Bisection log attached.
>>>
>>> Reverting this commit on linux-git[1] fixes both (1) and (2).
>>>
>>> Important notes:
>>> - This regression was reported on a DELL XPS 9550 laptop by two users [2], so
>>> it might be related strictly to that model.
>>> - According to user mallocman, the issue can also be fixed by reverting the
>>> BIOS version of the laptop to v1.12.
>>> - The issue ONLY occurs when AC is plugged in (and stays there even when I
>>> unplug it).
>>> - When booting on battery power, there is no issue at all.
>>>
>>> You can easily observe the regression via:
>>>
>>> watch cat /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq
>>>
>>> As soon as I plug in AC, all frequencies go up to values around 3248338 and
>>> stay there even if I unplug AC. This does not happen at all when booted on
>>> battery power.
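
[Editorial sketch, not part of the original report.] For anyone who wants
to log the values instead of watching them, the check above can be
approximated with a short script. This is a minimal sketch that assumes the
standard cpufreq sysfs layout and the scaling_cur_freq attribute (values in
kHz); the function name read_cur_freqs and the 2-second polling interval are
illustrative choices, not something taken from the report.

```python
import glob
import os
import time


def read_cur_freqs(base="/sys/devices/system/cpu"):
    """Return {cpu_index: frequency_in_kHz} for each CPU exposing cpufreq.

    'base' is a parameter only so the function can be pointed at a
    different directory tree (an assumption made for testability).
    """
    freqs = {}
    pattern = os.path.join(base, "cpu[0-9]*", "cpufreq", "scaling_cur_freq")
    for path in glob.glob(pattern):
        # path looks like <base>/cpuN/cpufreq/scaling_cur_freq
        cpu_dir = os.path.basename(os.path.dirname(os.path.dirname(path)))
        suffix = cpu_dir[len("cpu"):]
        if not suffix.isdigit():
            continue
        try:
            with open(path) as f:
                freqs[int(suffix)] = int(f.read().strip())
        except (OSError, ValueError):
            # CPU went offline or the attribute was unreadable; skip it.
            continue
    return freqs


if __name__ == "__main__":
    # Poll every 2 seconds, mirroring the default interval of watch(1).
    while True:
        freqs = read_cur_freqs()
        line = " ".join(f"cpu{i}={freqs[i]}" for i in sorted(freqs))
        print(time.strftime("%H:%M:%S"), line, flush=True)
        time.sleep(2)
```

Running it while plugging and unplugging AC should show whether the
reported frequencies move to, and stay near, the values described above.
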
>>>
>>> Also note:
>>> - the laptop's fans are not really affected by the high frequencies.
>>> - setting the governor to "powersave" has no effect on the frequencies (as
>>> compared to when on battery power).
>>> - lowering the maximum frequency manually works, but does not fix (1).
>>>
>>> [1] https://aur.archlinux.org/pkgbase/linux-git/ (pulled commits up to
>>> 0280e3c58f92b2fe0e8fbbdf8d386449168de4a8).
>>> [2] https://bbs.archlinux.org/viewtopic.php?id=273330
>
> I hope we can find a better solution, but since the responsiveness
> issue is a significant regression, I queued up a revert of
> 0e8ae5a6ff59 ("PCI/portdrv: Do not setup up IRQs if there are no
> users") in case we don't find one.

Likely best for now.

>
> If/when we get to the bottom of this, I'll replace the revert with the
> solution. 0e8ae5a6ff59 appeared in v5.16, so we'll have to make sure
> we fix that as well.

If you could give some feedback/hints on the questions I posted last
week on the original patch, that might accelerate understanding the real
issue.

Thanks,
Jan

--
Siemens AG, Technology
Competence Center Embedded Linux