On 29.08.23 13:28, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 11.07.23 13:16, Shreeya Patel wrote:
>> On 10/07/23 17:39, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
>>> for once, to make this easily accessible to everyone.
>>>
>>> Shreeya Patel, Masahiro Yamada: what's the status of this? Was any
>>> progress made to address this? Or is this maybe (accidentally?) fixed
>>> with 6.5-rc1?
>>
>> I still see the regression happening so it doesn't seem to be fixed.
>> https://linux.kernelci.org/test/case/id/64ac675a8aebf63753bb2a8c/
>>
>> Masahiro had submitted a fix for this issue here.
>> https://lore.kernel.org/lkml/ZJEni98knMMkU%2Fcl@xxxxxxxxxxxxxxxxxx/T/#t
>>
>> But I don't see any movement there. Masahiro, are you planning to send a
>> v2 for it?
>
> That was weeks ago and we didn't get an answer. :-/ Was this fixed in
> between? Doesn't look like it from here, but I might be missing something.

Still no reply. :-/

Shreeya Patel, does the problem still happen with 6.6-rc1 and do you
still want to see it fixed? In that case our only option to get things
rolling again might be to involve Linus, unless someone in the CC list
has an idea to resolve this. Might also be good to know if reverting the
culprit fixes the problem.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
#regzbot poke
>>> On 20.06.23 06:19, Masahiro Yamada wrote:
>>>> On Mon, Jun 12, 2023 at 7:10 PM Shreeya Patel
>>>> <shreeya.patel@xxxxxxxxxxxxx> wrote:
>>>>> On 24/05/23 02:57, Nick Desaulniers wrote:
>>>>>> On Tue, May 23, 2023 at 3:27 AM Shreeya Patel
>>>>>> <shreeya.patel@xxxxxxxxxxxxx> wrote:
>>>>>>> Hi Nick and Masahiro,
>>>>>>>
>>>>>>> On 23/05/23 01:22, Nick Desaulniers wrote:
>>>>>>>> On Mon, May 22, 2023 at 9:52 AM Greg KH
>>>>>>>> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>>>>>>>>> On Mon, May 22, 2023 at 12:09:34PM +0200, Ricardo Cañuelo wrote:
>>>>>>>>>> On Fri, May 19 2023 at 08:57:24, Nick Desaulniers
>>>>>>>>>> <ndesaulniers@xxxxxxxxxx> wrote:
>>>>>>>>>>> It could be; if the link order was changed, it's possible that
>>>>>>>>>>> this target may be hitting something along the lines of:
>>>>>>>>>>> https://isocpp.org/wiki/faq/ctors#static-init-order i.e. the
>>>>>>>>>>> "static initialization order fiasco"
>>>>>>>>>>>
>>>>>>>>>>> I'm struggling to think of how this appears in C codebases, but I
>>>>>>>>>>> swear years ago I had a discussion with GKH (maybe?) about this. I
>>>>>>>>>>> think I was playing with converting Kbuild to use Ninja rather than
>>>>>>>>>>> Make; the resulting kernel image wouldn't boot because I had
>>>>>>>>>>> modified the order the object files were linked in. If you were to
>>>>>>>>>>> randomly shuffle the object files in the kernel, I recall some
>>>>>>>>>>> hazard that may prevent boot.
>>>>>>>>>>
>>>>>>>>>> I thought that was specifically a C++ problem? But then again, the
>>>>>>>>>> kernel docs explicitly say that the ordering of obj-y goals in
>>>>>>>>>> kbuild is significant in some instances [1]:
>>>>>>>>>
>>>>>>>>> Yes, it matters, you can not change it. If you do, systems will
>>>>>>>>> break.
>>>>>>>>>
>>>>>>>>> It is the only way we have of properly ordering our init calls
>>>>>>>>> within the same "level".
>>>>>>>>
>>>>>>>> Ah, right it was the initcall ordering. Thanks for the reminder.
>>>>>>>>
>>>>>>>> (There's a joke in there similar to the use of regexes to solve a
>>>>>>>> problem resulting in two new problems; initcalls have levels for
>>>>>>>> ordering, but we still have (unexpressed) dependencies between calls
>>>>>>>> of the same level; brittle!).
>>>>>>>>
>>>>>>>> +Maksim, since that might be relevant info for the BOLT+Kernel work.
>>>>>>>>
>>>>>>>> Ricardo,
>>>>>>>> https://elinux.org/images/e/e8/2020_ELCE_initcalls_myjosserand.pdf
>>>>>>>> mentions that there's a kernel command line param `initcall_debug`.
>>>>>>>> Perhaps that can be used to see if
>>>>>>>> 5750121ae7382ebac8d47ce6d68012d6cd1d7926 somehow changed initcall
>>>>>>>> ordering, resulting in a config that cannot boot?
>>>>>>>
>>>>>>> Here are the links to Lava jobs run with initcall_debug added to the
>>>>>>> kernel command line.
>>>>>>>
>>>>>>> 1. Where regression happens
>>>>>>> (5750121ae7382ebac8d47ce6d68012d6cd1d7926)
>>>>>>> https://lava.collabora.dev/scheduler/job/10417706
>>>>>>>
>>>>>>> 2. With a revert of the commit
>>>>>>> 5750121ae7382ebac8d47ce6d68012d6cd1d7926
>>>>>>> https://lava.collabora.dev/scheduler/job/10418012
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Yeah, I can see a diff in the initcall ordering as a result of
>>>>>> commit 5750121ae738 ("kbuild: list sub-directories in ./Kbuild")
>>>>>> https://gist.github.com/nickdesaulniers/c09db256e42ad06b90842a4bb85cc0f4
>>>>>>
>>>>>> Not just different orderings, but some initcalls seem unique to the
>>>>>> before vs. after, which is troubling. (example init_events and
>>>>>> init_fs_sysctls respectively)
>>>>>>
>>>>>> That isn't conclusive evidence that changes to initcall ordering are
>>>>>> to blame, but I suspect confirming that precisely to be very very time
>>>>>> consuming.
>>>>>>
>>>>>> Masahiro, what are your thoughts on reverting 5750121ae738? There are
>>>>>> conflicts in Kbuild and Makefile when reverting 5750121ae738 on
>>>>>> mainline.
>>>>>
>>>>> I'm not sure if you followed the conversation but we are still seeing
>>>>> this regression with the latest kernel builds and would like to know if
>>>>> you plan to revert 5750121ae738?
>>>>
>>>> Reverting 5750121ae738 does not solve the issue
>>>> because the issue happens even before 5750121ae738.
>>>>
>>>> multi_v7_defconfig + debug.config + CONFIG_MODULES=n
>>>> fails to boot in the same way.
>>>>
>>>> The revert would hide the issue on a particular build setup.
>>>>
>>>> I submitted a patch to pinpoint the issue more precisely.
>>>> Let's see how it goes.
>>>>
>>>> https://lore.kernel.org/lkml/ZJEni98knMMkU%2Fcl@xxxxxxxxxxxxxxxxxx/T/#t
>>>>
>>>> (BTW, the initcall order is unrelated)
>>>>
>>>>>>> Thanks,
>>>>>>> Shreeya Patel
>>>>>
>>>>> Thanks,
>>>>> Shreeya Patel
>>>>
>>>> --
>>>> Best Regards
>>>> Masahiro Yamada
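
For anyone who wants to repeat the initcall comparison discussed in the
quoted thread, here is a minimal sketch in Python. It assumes each boot log
was captured with `initcall_debug` on the kernel command line and contains
lines of the form "calling  <symbol>+0x<off>/0x<size> @ <pid>"; that pattern
is an assumption about the log format and may need adjusting. The helper
names (`initcall_order`, `compare_orders`) are hypothetical, and the sample
logs are synthetic, reusing only the two initcall names mentioned above.
Note that this only shows whether the order differs; as Masahiro points out,
that need not be the cause of the boot failure.

```python
import re

# Assumed log line shape: "[    0.123456] calling  init_events+0x0/0x58 @ 1".
# The word boundary keeps "initcall ... returned" lines from matching.
CALLING_RE = re.compile(r"\bcalling\s+([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def initcall_order(log_text):
    """Return initcall symbol names in the order they were invoked."""
    return [m.group(1) for m in CALLING_RE.finditer(log_text)]


def compare_orders(before, after):
    """Summarise how two initcall sequences differ."""
    before_set, after_set = set(before), set(after)
    common = before_set & after_set
    # Restrict both sequences to shared initcalls before comparing order.
    before_common = [name for name in before if name in common]
    after_common = [name for name in after if name in common]
    first_divergence = None
    for index, (a, b) in enumerate(zip(before_common, after_common)):
        if a != b:
            first_divergence = (index, a, b)
            break
    return {
        "only_before": sorted(before_set - after_set),
        "only_after": sorted(after_set - before_set),
        "first_divergence": first_divergence,
    }


# Synthetic sample logs (not taken from the LAVA jobs).
LOG_BEFORE = (
    "[    0.10] calling  init_events+0x0/0x58 @ 1\n"
    "[    0.11] initcall init_events+0x0/0x58 returned 0 after 0 usecs\n"
    "[    0.12] calling  a_init+0x0/0x10 @ 1\n"
    "[    0.13] calling  b_init+0x0/0x10 @ 1\n"
)
LOG_AFTER = (
    "[    0.10] calling  b_init+0x0/0x10 @ 1\n"
    "[    0.11] calling  a_init+0x0/0x10 @ 1\n"
    "[    0.12] calling  init_fs_sysctls+0x0/0x20 @ 1\n"
)

if __name__ == "__main__":
    summary = compare_orders(initcall_order(LOG_BEFORE), initcall_order(LOG_AFTER))
    for key, value in summary.items():
        print(key, value)
```

To use it on real data, read the two serial logs from the LAVA jobs into
strings and pass them through `initcall_order` in place of the sample logs.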