Re: [regression] 6.8.1: fails to hibernate with pm_runtime_force_suspend+0x0/0x120 returns -16

From: Martin Steigerwald
Date: Tue Apr 02 2024 - 15:43:10 EST


Hi Thorsten, hi,

Linux regression tracking (Thorsten Leemhuis) - 19.03.24, 09:40:06 CEST:
> On 16.03.24 17:12, Martin Steigerwald wrote:
> > Martin Steigerwald - 16.03.24, 17:02:44 CET:
> >> ThinkPad T14 AMD Gen 1 fails to hibernate with self-compiled 6.8.1.
> >> Hibernation works correctly with self-compiled 6.7.9.
> >
> > Apparently 6.8.1 does not even reboot correctly anymore. runit on
> > Devuan. It says it is doing the system reboot but then nothing
> > happens.
> >
> > As for hibernation the kernel cancels the attempt and returns back to
> > user space desktop session.
> >
> >> Trying to use "no_console_suspend" to debug next. Will not do bisect
> >> between major kernel releases on a production machine.
>
> FWIW, without a bisection I guess no developer will take a closer look
> (but I might be wrong and you lucky here!), as any change in those
> hundreds of drivers used on that machine can possibly lead to problems
> like yours. So without a bisection we are likely stuck here, unless
> someone else runs into the same problem and bisects or fixes it. Sorry,
> but that's just how it is.

I have been asked this repeatedly with previous bug reports. My issue
with bisecting between major kernel versions is this:

When I look around here I see no second ThinkPad T14 AMD Gen 1 that I
could use for testing. Also doing a kernel bisect using a GRML live iso…
not really.

The one I reported this from is a production machine with a 4 TB NVMe
SSD which contains a lot of data. I am not willing to risk data loss or
(silent) file system corruption by bisecting between major kernel
releases. Bisecting between major kernel releases would, in my
understanding, require testing various releases between, in this example,
6.7 and 6.8, and even between 6.7 and 6.8-rc1. At least in my
understanding, anything between 6.7 and 6.8-rc1 is not guaranteed to be
even somewhat stable. I do not usually install an rc1 kernel on a
production machine, but rather wait for at least rc2/3 nowadays. It's a
balanced risk calculation.
And rc2/3 or later appears to be a risk I am willing to take. But
something between stable and rc1? Nope.

It is not even that rare. Some 6.7 rc failed to hibernate as well.
With exactly the same machine. I refused to do a bisect as well in that
case. At some later time the issue was fixed without me doing anything
more.

Now my question is this: If I am not willing to bisect in such a case,
is a bug report even useful? Otherwise I may just switch this last machine
to distribution kernels. It would save a lot of time for me. This private
and freelancer production machine is the last left-over machine with self-
compiled kernels.

So far I still thought I would somehow be contributing to Linux kernel
quality with detailed bug reports that take time to write, but apparently
I am not. Can you clarify?

Ciao,
--
Martin