Re: SCSI hotplug issues with UEFI VM with guest kernel >= 6.5
From: Igor Mammedov
Date: Tue Dec 12 2023 - 06:26:30 EST
On Mon, 11 Dec 2023 14:52:42 +0100
Fiona Ebner <f.ebner@xxxxxxxxxxx> wrote:
> Am 11.12.23 um 08:46 schrieb Igor Mammedov:
> > On Fri, 8 Dec 2023 16:47:23 +0100
> > Igor Mammedov <imammedo@xxxxxxxxxx> wrote:
> >
> >> On Thu, 7 Dec 2023 17:28:15 -0600
> >> Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> >>
> >>>
> >>> What's the actual symptom that this is broken? All these log
> >>> fragments show the exact same assignments for BARs 0, 1, 4 and for the
> >>> bridge windows.
> >>>
>
> The disk never shows up in /dev
>
> >>> I assume 0000:01:02.0 is the hot-added SCSI HBA, and 00:05.0 is a
> >>> bridge leading to it?
> >>>
> >>> Can you put the complete dmesg logs somewhere? There's a lot of
> >>> context missing here.
> >>>
>
> Is this still necessary with Igor being able to reproduce the issue?
It's not necessary, but it would help to find out what's going wrong faster.
Otherwise we would need to fall back to debugging over email.
Are you willing to help with testing/providing debug logs to track down
the cause?
Debugging over email would be slow, though, so our best option is to revert
the offending patches until the cause is found/fixed.
> >>> Do you have to revert both cc22522fd55e2 and 40613da52b13f to make it
> >>> work reliably? If we have to revert something, reverting one would be
> >>> better than reverting both.
> >>
>
> Just reverting cc22522fd55e2 is not enough (and cc22522fd55e2 fixes
> 40613da52b13f so I can't revert just 40613da52b13f).
With a UEFI setup, it still works fine for me with current master:
Kernel 6.7.0-rc5-00014-g26aff849438c on an x86_64 (ttyS0)
ibm-p8-kvm-03-guest-02 login: pci 0000:01:02.0: [1af4:1004] type 00 class 0x010000
pci 0000:01:02.0: reg 0x10: [io 0x0000-0x003f]
pci 0000:01:02.0: reg 0x14: [mem 0x00000000-0x00000fff]
pci 0000:01:02.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
pci 0000:01:02.0: BAR 4: assigned [mem 0x380000004000-0x380000007fff 64bit pref]
pci 0000:01:02.0: BAR 1: assigned [mem 0xc1001000-0xc1001fff]
pci 0000:01:02.0: BAR 0: assigned [io 0xc040-0xc07f]
pci 0000:00:05.0: PCI bridge to [bus 01]
pci 0000:00:05.0: bridge window [io 0xc000-0xcfff]
pci 0000:00:05.0: bridge window [mem 0xc1000000-0xc11fffff]
pci 0000:00:05.0: bridge window [mem 0x380000000000-0x3807ffffffff 64bit pref]
virtio-pci 0000:01:02.0: enabling device (0000 -> 0003)
scsi host3: Virtio SCSI HBA
pci 0000:00:05.0: PCI bridge to [bus 01]
pci 0000:00:05.0: bridge window [io 0xc000-0xcfff]
pci 0000:00:05.0: bridge window [mem 0xc1000000-0xc11fffff]
pci 0000:00:05.0: bridge window [mem 0x380000000000-0x3807ffffffff 64bit pref]
scsi 3:0:0:0: Direct-Access QEMU QEMU HARDDISK 2.5+ PQ: 0 ANSI: 5
sd 3:0:0:0: Power-on or device reset occurred
sd 3:0:0:0: Attached scsi generic sg2 type 0
sd 3:0:0:0: LUN assignments on this target have changed. The Linux SCSI layer does not automatically remap LUN assignments.
sd 3:0:0:0: [sdb] 5190784 512-byte logical blocks: (2.66 GB/2.47 GiB)
sd 3:0:0:0: [sdb] Write Protect is off
sd 3:0:0:0: [sdb] Mode Sense: 63 00 00 08
sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
GPT:Primary header thinks Alt. header is not at the end of the disk.
GPT:5190099 != 5190783
GPT:Alternate GPT header not at the end of the disk.
GPT:5190099 != 5190783
GPT: Use GNU Parted to correct GPT errors.
sdb: sdb1 sdb2
sd 3:0:0:0: [sdb] Attached SCSI disk
However, it still doesn't work with Fedora's 6.7.0-0.rc2.20231125git0f5cc96c367f.26.fc40.x86_64 kernel.
It needs -smp 4 to break, though;
with -smp 1 it works fine there as well.
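
When comparing logs from working and failing boots, a small script along
these lines might help to sanity-check that every "BAR N: assigned" range
falls inside one of the logged bridge windows. This is only a rough sketch:
the function names are made up for illustration, it assumes the dmesg line
format shown above, and it does not track which bridge a device actually
sits behind (any window of the same resource type counts as a match).

```python
import re

# Matches e.g. "pci 0000:01:02.0: BAR 4: assigned [mem 0x...-0x... 64bit pref]"
BAR_RE = re.compile(
    r"pci (?P<dev>\S+): BAR (?P<bar>\d+): assigned "
    r"\[(?P<kind>io|mem)\s+0x(?P<start>[0-9a-f]+)-0x(?P<end>[0-9a-f]+)"
)
# Matches e.g. "pci 0000:00:05.0: bridge window [mem 0xc1000000-0xc11fffff]"
WIN_RE = re.compile(
    r"pci (?P<dev>\S+):\s+bridge window "
    r"\[(?P<kind>io|mem)\s+0x(?P<start>[0-9a-f]+)-0x(?P<end>[0-9a-f]+)"
)


def parse_assignments(log):
    """Return (bars, windows) parsed from dmesg text (hypothetical helper)."""
    bars, windows = [], []
    for line in log.splitlines():
        m = BAR_RE.search(line)
        if m:
            bars.append((m["dev"], int(m["bar"]), m["kind"],
                         int(m["start"], 16), int(m["end"], 16)))
            continue
        m = WIN_RE.search(line)
        if m:
            windows.append((m["kind"],
                            int(m["start"], 16), int(m["end"], 16)))
    return bars, windows


def uncovered_bars(log):
    """List assigned BARs not contained in any window of the same type."""
    bars, windows = parse_assignments(log)
    return [
        bar for bar in bars
        if not any(kind == bar[2] and start <= bar[3] and bar[4] <= end
                   for kind, start, end in windows)
    ]
```

For the log above this should report nothing, which matches Bjorn's
observation that the assignments look the same in all fragments, so it only
rules out an obvious resource mismatch and does not explain the failure.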
> > Fiona,
> >
> > Does it help if you use q35 machine with '-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off' option?
> >
>
> Yes, it does :)
>
> I added the following to my QEMU commandline (first line, because there
> wouldn't be a "pci.0" otherwise):
>
> > -device 'pci-bridge,id=pci.0,chassis_nr=4' \
> > -machine 'q35' \
> > -global 'ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off' \
>
> and while it takes a few seconds, the disk does show up successfully:
The delay is normal for SHPC.
>
> Best Regards,
> Fiona
>