Re: Probing nvme disks fails on Upstream kernels on powerpc Maxconfig

From: Michael Ellerman
Date: Mon May 22 2023 - 03:41:33 EST


Srikar Dronamraju <srikar@xxxxxxxxxxxxxxxxxx> writes:
> * Alexey Kardashevskiy <aik@xxxxxxxxx> [2023-04-13 22:09:22]:
>
>> > > On 23.03.23 10:53, Srikar Dronamraju wrote:
>> > > >
>> > > > I am unable to boot upstream kernels from v5.16 to the latest upstream
>> > > > kernel on a maxconfig system. (Machine config details given below)
>> > > >
>> > > > At boot, we see a series of messages like the below.
>> > > >
>> > > > dracut-initqueue[13917]: Warning: dracut-initqueue: timeout, still waiting for following initqueue hooks:
>> > > > dracut-initqueue[13917]: Warning: /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2fdisk\x2fby-uuid\x2f93dc0767-18aa-467f-afa7-5b4e9c13108a.sh: "if ! grep -q After=remote-fs-pre.target /run/systemd/generator/systemd-cryptsetup@*.service 2>/dev/null; then
>> > > > dracut-initqueue[13917]: [ -e "/dev/disk/by-uuid/93dc0767-18aa-467f-afa7-5b4e9c13108a" ]
>> > > > dracut-initqueue[13917]: fi"
>> > >
>> > > Alexey, did you look into this? This is apparently caused by a commit of
>> > > yours (see quoted part below) that Michael applied. Looks like it fell
>> > > through the cracks from here, but maybe I'm missing something.
>> >
>> > Unfortunately Alexey is not working at IBM any more, so he won't have
>> > access to any hardware to debug/test this.
>> >
>> > Srikar are you debugging this? If not we'll have to find someone else to
>> > look at it.
>>
>> Has this been fixed and I missed cc:? Anyway, without the full log, I still
>> see it is a huge guest so chances are the guest could not map all RAM so
>> instead it uses the biggest possible DDW with 2M pages. If that's the case,
>> this might help it:
>>
>
> Hi Alexey, Michael
>
> Sorry for the late reply, but I didn't have access to this large system.
> This weekend, I did get access and tested with the patch. However, it didn't
> help much; the system is still stuck at dracut with a similar message, except
> for the trace.
>
> However this patch
> https://lore.kernel.org/all/20230418204401.13168-1-gbatra@xxxxxxxxxxxxxxxxxx/
> from Gaurav Batra does solve this issue.

Thanks.

There was a v3 of that patch:
https://lore.kernel.org/all/20230504175913.83844-1-gbatra@xxxxxxxxxxxxxxxxxx/

Which is merged now into mainline as:
096339ab84f3 ("powerpc/iommu: DMA address offset is incorrectly calculated with 2MB TCEs")

Presumably it also fixes the bug for you, so I'll mark this as fixed,
but if you can test that exact commit that would be good to confirm the
bug is fixed in mainline.

cheers


#regzbot fixed-by: 096339ab84f3