Re: Linux regressions report for mainline [2023-10-29]

From: Linus Torvalds
Date: Sun Oct 29 2023 - 22:53:20 EST


On Sun, 29 Oct 2023 at 16:18, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:
>
> We are investigating and hope the simpledrm problem can be fixed in
> some days [1],

I don't understand your "some days". The original report was two+
weeks ago, and the link you point to does not seem to have a suggested
patch for the problem either.

So where does the "some days" come from?

The WHOLE POINT of the "no regressions" rule - and the reason it came
to be in the first place - was that we used to have these endless "one
step forward, two steps back" things with suspend/resume in
particular, where people fixed one device, but then broke a random
number of other devices, and kept saying "but I fixed something".

No. If you broke something else, YOU DIDN'T FIX ANYTHING AT ALL.

This is literally why we have that "no regressions" rule. No amount of
"but it's a fix" is valid at all if something else breaks. And no
amount of "I will fix the thing I broke in the future" is valid
either.

If you don't have a fix for it, it's broken. And I don't even see a
*suggested* fix for people to try out.

> and the blank screen seems not a very harmful problem
> (maybe I'm wrong but I think most of people are using GUI now). So,
> can we keep the commit 60aebc9559492c at this time?

At least the email from Evan Preston seems to imply it's a blank
screen that doesn't go away.

"Upgrading from Linux 6.4.12 to 6.5 and later results in only a
blank screen after boot and a rapidly flashing device-access-status
indicator"

And no, "most people using GUI" doesn't matter. You are supposed to be
able to upgrade your working kernel, and it's supposed to keep
working. That's *important*, because it's really really important that
people *trust* that they can upgrade the kernel and not end up with
something non-working, because that's how people then dare do kernel
updates and dare test new kernels.

If people then stop testing new kernels because they think new kernels
might break their setup, we have lost something truly important.

And yes, there are always exceptions. At some point, devices are just
too old and there is no way of testing them. Or we've had some
interface that was *so* mis-designed that it was a fundamental
security issue or something like that.

But no, this does not seem to be one of those issues.

Now, I'm not going to revert it just before releasing v6.6 (which I
have locally tagged, but not pushed out yet). And I'll have the merge
window for 6.7 opening tomorrow. But if this is not fixed by -rc1,
we'll just revert it.

Linus