Re: Linux 6.11-rc1

From: Jens Axboe
Date: Tue Jul 30 2024 - 14:55:57 EST


On 7/30/24 12:35 PM, Jens Axboe wrote:
> On 7/30/24 12:22 PM, Guenter Roeck wrote:
>> On 7/30/24 10:20, Jens Axboe wrote:
>>> On 7/30/24 11:04 AM, Guenter Roeck wrote:
>>>> On Mon, Jul 29, 2024 at 08:29:20AM -0700, Guenter Roeck wrote:
>>>>> On Sun, Jul 28, 2024 at 02:40:01PM -0700, Linus Torvalds wrote:
>>>>>> The merge window felt pretty normal, and the stats all look pretty
>>>>>> normal too. I was expecting things to be quieter because of summer
>>>>>> vacations, but that (still) doesn't actually seem to have been the
>>>>>> case.
>>>>>>
>>>>>> There's 12k+ regular commits (and another 850 merge commits), so as
>>>>>> always the summary of this all is just my merge log. The diffstats are
>>>>>> also (once again) dominated by some big hardware descriptions (another
>>>>>> AMD GPU register dump accounts for ~45% of the lines in the diff, and
>>>>>> some more perf event JSON descriptor files account for another 5%).
>>>>>>
>>>>>> But if you ignore those HW dumps, the diff too looks perfectly
>>>>>> regular: drivers account for a bit over half (even when not counting
>>>>>> the AMD register description noise). The rest is roughly one third
>>>>>> architecture updates (lots of it is dts files, so I guess I could have
>>>>>> lumped that in with "more hw descriptor tables"), one third tooling
>>>>>> and documentation, and one third "core kernel" (filesystems,
>>>>>> networking, VM and kernel). Very roughly.
>>>>>>
>>>>>> If you want more details, you should get the git tree, and then narrow
>>>>>> things down based on interests.
>>>>>>
>>>>>
>>>>> Build results:
>>>>> total: 158 pass: 139 fail: 19
>>>>> Failed builds:
>>>> ...
>>>>> i386:q35:pentium3:defconfig:pae:nosmp:net=ne2k_pci:initrd
>>>>
>>>> This failure bisects to commit 0256994887d7 ("Merge tag
>>>> 'for-6.11/block-post-20240722' of git://git.kernel.dk/linux"). I have no
>>>> idea why that would be the case, but it is easy to reproduce. Maybe it is
>>>> coincidental. Either way, copying Jens in case he has an idea.
>>>
>>> I can take a look, but please post some details on what is actually
>>> being run here so I can attempt to reproduce it. I looked at your
>>> initial email too, and there's a link in there to:
>>>
>>> https://kerneltests.org/builders
>>>
>>> but I'm still not sure what's being run.
>>>
>>
>> Please see http://server.roeck-us.net/qemu/x86-nosmp/
>
> Works fine for me on current master, boots and runs self tests and
> then shuts down. Tried it 5 times now.
>
> axboe@r7625 ~/g/linux-vm (master)> qemu-system-i386 --version
> QEMU emulator version 8.2.4 (Debian 1:8.2.4+ds-1)
> Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers
>
> Then tried 6.11-rc1 10 times in a loop, and also didn't see any failures.
>
> I then switched to using gcc-11 as that seems to be what you are using,
> and then it does indeed bomb during boot. Funky. I'll check the post
> branch and see if it's anything from there.

I can fully revert that for-6.11/block-post merge and it still crashes
in the same way for me. So I don't believe that's the culprit. It
consistently crashes with a double fault when starting cryptomgr, so
that may be a clue.

FWIW, if I disable KFENCE, then it boots just fine with gcc-11. Or if I
use gcc 13 or 14 it works just fine regardless of whether KFENCE is set
or not.

--
Jens Axboe