Re: [PATCH v4] Makefile.compiler: replace cc-ifversion with compiler-specific macros

From: Nick Desaulniers
Date: Fri May 19 2023 - 11:57:59 EST


On Fri, May 19, 2023 at 1:35 AM Ricardo Cañuelo
<ricardo.canuelo@xxxxxxxxxxxxx> wrote:
>
> On Thu, May 18 2023 at 14:12:30, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
> > That's a higher risk change (and has my name on the tested-by tag, yikes).
> >
> > So is that the culprit of the boot failure you're observing?
>
> Right now it is.
>
> Here's a test run using that commit
> (5750121ae7382ebac8d47ce6d68012d6cd1d7926):
> https://lava.collabora.dev/scheduler/job/10373216
>
> Here's one with the commit right after that one
> (26ef40de5cbb24728a34a319e8d42cdec99f186c):
> https://lava.collabora.dev/scheduler/job/10371513
>
> Then one with 26ef40de5cbb24728a34a319e8d42cdec99f186c with a revert
> commit for 5750121ae7382ebac8d47ce6d68012d6cd1d7926 on top:
> https://lava.collabora.dev/scheduler/job/10371882
>
> But I'm not confident enough to jump ahead and call this a kernel
> regression, especially after the bisector confidently said that about
> your commit and then it turned out none of us could reproduce it.

It could be. If the link order changed, this target may be hitting
something along the lines of the "static initialization order fiasco":
https://isocpp.org/wiki/faq/ctors#static-init-order

I'm struggling to think of how this appears in C codebases, but I
swear years ago I had a discussion with GKH (maybe?) about this. I
think I was playing with converting Kbuild to use Ninja rather than
Make; the resulting kernel image wouldn't boot because I had modified
the order the object files were linked in. I recall that randomly
shuffling the object files in the kernel link can produce an image
that fails to boot.
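
To make the hazard concrete, here is a minimal sketch in plain C (not
kernel code). It assumes, as I understand it, that initcalls at the same
level run in the order their objects were linked, so reordering objects
reorders the calls. The names bus_init, driver_init, run_initcalls and
the two arrays are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical state owned by a "bus" subsystem; set by bus_init(). */
int bus_ready;

/* Hypothetical initcall: brings the bus up. */
int bus_init(void)
{
	bus_ready = 1;
	return 0;
}

/* Hypothetical initcall that silently assumes bus_init() already ran. */
int driver_init(void)
{
	return bus_ready ? 0 : -1;
}

typedef int (*initcall_t)(void);

/*
 * Run the calls in array order, the way same-level initcalls are assumed
 * to run in link order. Returns 0 on success or the first failure code.
 */
int run_initcalls(const initcall_t *calls, size_t n)
{
	bus_ready = 0;	/* start every run from a cold state */
	for (size_t i = 0; i < n; i++) {
		int ret = calls[i]();

		if (ret)
			return ret;
	}
	return 0;
}

/* Same two objects, two different "link orders". */
const initcall_t link_order_ok[]  = { bus_init, driver_init };
const initcall_t link_order_bad[] = { driver_init, bus_init };
```

Calling run_initcalls(link_order_ok, 2) succeeds, while
run_initcalls(link_order_bad, 2) fails in driver_init. Nothing in the
source of either function changed; only the order did, which is why a
commit that merely perturbs link order could plausibly expose a latent
dependency like this.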

>
> There have been some cases where a commit made a test fail (kernel
> failing to load, for instance) and the real problem was that the kernel
> got bigger than the target was capable of handling. So not a problem
> with the commit at all, it was just that the memory mappings needed to
> be redefined for that target. What I'm saying is that sometimes a
> regression report is really uncovering a problem in the test setup
> rather than introducing a bug. Maybe this is one of those cases.
>
> Cheers,
> Ricardo



--
Thanks,
~Nick Desaulniers