Re: [PATCH v4] Makefile.compiler: replace cc-ifversion with compiler-specific macros

From: Shreeya Patel
Date: Tue Jul 11 2023 - 07:16:27 EST



On 10/07/23 17:39, Linux regression tracking (Thorsten Leemhuis) wrote:
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

Shreeya Patel, Masahiro Yamada: what's the status of this? Was any
progress made to address this? Or is this maybe (accidentally?) fixed
with 6.5-rc1?

Hi Thorsten,

I still see the regression happening so it doesn't seem to be fixed.
https://linux.kernelci.org/test/case/id/64ac675a8aebf63753bb2a8c/

Masahiro had submitted a fix for this issue here.

https://lore.kernel.org/lkml/ZJEni98knMMkU%2Fcl@xxxxxxxxxxxxxxxxxx/T/#t

But I don't see any movement there. Masahiro, are you planning to send a v2 for it?


Thanks,
Shreeya Patel

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 20.06.23 06:19, Masahiro Yamada wrote:
On Mon, Jun 12, 2023 at 7:10 PM Shreeya Patel
<shreeya.patel@xxxxxxxxxxxxx> wrote:
On 24/05/23 02:57, Nick Desaulniers wrote:
On Tue, May 23, 2023 at 3:27 AM Shreeya Patel
<shreeya.patel@xxxxxxxxxxxxx> wrote:
Hi Nick and Masahiro,

On 23/05/23 01:22, Nick Desaulniers wrote:
On Mon, May 22, 2023 at 9:52 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
On Mon, May 22, 2023 at 12:09:34PM +0200, Ricardo Cañuelo wrote:
On Fri, May 19 2023 at 08:57:24, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
It could be; if the link order was changed, it's possible that this
target may be hitting something along the lines of:
https://isocpp.org/wiki/faq/ctors#static-init-order i.e. the "static
initialization order fiasco"

I'm struggling to think of how this appears in C codebases, but I
swear years ago I had a discussion with GKH (maybe?) about this. I
think I was playing with converting Kbuild to use Ninja rather than
Make; the resulting kernel image wouldn't boot because I had modified
the order the object files were linked in. If you were to randomly
shuffle the object files in the kernel, I recall some hazard that may
prevent boot.
I thought that was specifically a C++ problem? But then again, the
kernel docs explicitly say that the ordering of obj-y goals in kbuild is
significant in some instances [1]:
Yes, it matters, you cannot change it. If you do, systems will break.
It is the only way we have of properly ordering our init calls within
the same "level".
Ah, right it was the initcall ordering. Thanks for the reminder.

(There's a joke in there similar to the use of regexes to solve a
problem resulting in two new problems; initcalls have levels for
ordering, but we still have (unexpressed) dependencies between calls
of the same level; brittle!).

+Maksim, since that might be relevant info for the BOLT+Kernel work.

Ricardo,
https://elinux.org/images/e/e8/2020_ELCE_initcalls_myjosserand.pdf
mentions that there's a kernel command line param `initcall_debug`.
Perhaps that can be used to see if
5750121ae7382ebac8d47ce6d68012d6cd1d7926 somehow changed initcall
ordering, resulting in a config that cannot boot?
Here are the links to the LAVA jobs run with initcall_debug added to the
kernel command line.

1. Where regression happens (5750121ae7382ebac8d47ce6d68012d6cd1d7926)
https://lava.collabora.dev/scheduler/job/10417706

2. With a revert of the commit 5750121ae7382ebac8d47ce6d68012d6cd1d7926
https://lava.collabora.dev/scheduler/job/10418012
Thanks!

Yeah, I can see a diff in the initcall ordering as a result of
commit 5750121ae738 ("kbuild: list sub-directories in ./Kbuild")

https://gist.github.com/nickdesaulniers/c09db256e42ad06b90842a4bb85cc0f4

Not just different orderings, but some initcalls seem unique to the
before vs. after, which is troubling. (example init_events and
init_fs_sysctls respectively)
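
For anyone who wants to repeat this kind of comparison on their own boot logs, here is a rough sketch of the idea, not the script used for the gist above. It assumes that initcall_debug prints lines of the form "calling  <symbol>+0x<off>/0x<size> @ <pid>". The function names (initcall_order, compare_initcalls) are made up for illustration.

```python
import re

# Assumed initcall_debug line shape, e.g.:
#   [    0.123456] calling  init_events+0x0/0x68 @ 1
CALLING = re.compile(r"calling\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def initcall_order(log_text):
    """Return initcall symbol names in the order they were invoked."""
    return [m.group(1) for m in CALLING.finditer(log_text)]


def compare_initcalls(before_log, after_log):
    """Compare two boot logs (hypothetical helper).

    Returns (only_before, only_after, reordered) where the first two are
    sorted lists of initcalls seen in just one log, and reordered says
    whether the initcalls common to both logs ran in a different order.
    """
    before = initcall_order(before_log)
    after = initcall_order(after_log)
    before_set, after_set = set(before), set(after)

    only_before = sorted(before_set - after_set)
    only_after = sorted(after_set - before_set)

    # Restrict each sequence to the shared initcalls, then compare order.
    shared_before = [name for name in before if name in after_set]
    shared_after = [name for name in after if name in before_set]
    return only_before, only_after, shared_before != shared_after
```

Note that static initcalls sharing a name in different files would be conflated by this sketch, so it only gives a first impression of what changed.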

That isn't conclusive evidence that changes to initcall ordering are
to blame, and I suspect that confirming it precisely would be very
time-consuming.

Masahiro, what are your thoughts on reverting 5750121ae738? There are
conflicts in Kbuild and Makefile when reverting 5750121ae738 on
mainline.
I'm not sure if you followed the conversation but we are still seeing
this regression with the latest kernel builds and would like to know if
you plan to revert 5750121ae738?

Reverting 5750121ae738 does not solve the issue
because the issue happens even before 5750121ae738.
multi_v7_defconfig + debug.config + CONFIG_MODULES=n
fails to boot in the same way.

The revert would hide the issue on a particular build setup.


I submitted a patch to pinpoint the issue more precisely.
Let's see how it goes.
https://lore.kernel.org/lkml/ZJEni98knMMkU%2Fcl@xxxxxxxxxxxxxxxxxx/T/#t


(BTW, the initcall order is unrelated)






Thanks,
Shreeya Patel


--
Best Regards
Masahiro Yamada