Re: Linux 6.11-rc1

From: Linus Torvalds
Date: Tue Jul 30 2024 - 19:55:12 EST


On Tue, 30 Jul 2024 at 16:29, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> Baffled. Is it possible that the crashing code catches some page boundary?

We've definitely seen things like that before. Some alignment change
makes something cross a cacheline or page boundary, and it magically
causes a huge regression.

Usually it's about performance, though, not this kind of thing.

But I could imagine that some odd instruction rewriting thing goes
wrong only when the instruction crosses a page boundary, and that
we've never happened to hit that case, and then some kernel config
moves the affected code around just enough.

That would then indirectly also explain why only some compiler
versions hit it - because it all depends on hitting that exact page
crosser.
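
To make the "page crosser" condition concrete, here is a minimal
sketch (not kernel code) of the check for whether an instruction of a
given length straddles a page boundary. The 4 KiB page size and the
helper name crosses_page() are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; real page size depends on arch and config. */
#define ASSUMED_PAGE_SIZE 4096u

/*
 * Hypothetical helper: returns true if the byte range
 * [addr, addr + len) touches more than one page.
 * Assumes addr + len does not wrap around the address space.
 */
static bool crosses_page(uintptr_t addr, size_t len)
{
	if (len == 0)
		return false;

	/* Compare the page index of the first and the last byte. */
	uintptr_t first_page = addr / ASSUMED_PAGE_SIZE;
	uintptr_t last_page = (addr + len - 1) / ASSUMED_PAGE_SIZE;

	return first_page != last_page;
}
```

With these assumptions, a 5-byte instruction at 0x1ffe crosses into
the next page, while the same instruction at 0x1ffb ends exactly on
the last byte of its page and does not. A shift of only a few bytes
from a config or compiler change is enough to flip the result.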

You also seemed to say that it only happened with some CPU selections.
Maybe there's something wrong with the ALTERNATIVE() cleanups - I'm
looking at that new "nested alternatives macros" thing, and the odd
games we play with the original and replacement lengths etc.

That all looks entirely crazy. That file was hard to read before, now
it's just incomprehensible to me.

Linus