Re: [Bug 216859] New: PCI bridge to bus boot hang at enumeration
From: Bjorn Helgaas
Date: Fri Jan 20 2023 - 00:30:21 EST
[+cc bjorn@xxxxxxxxxxx to avoid spamassassin]

On Wed, Jan 18, 2023 at 06:04:58PM -0600, Bjorn Helgaas wrote:
> On Fri, Jan 06, 2023 at 05:42:33PM +0100, Zeno Davatz wrote:
> > On Fri, Dec 30, 2022 at 7:50 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > > On Wed, Dec 28, 2022 at 12:42:34PM -0600, Bjorn Helgaas wrote:
> > > > On Wed, Dec 28, 2022 at 06:42:38PM +0100, Zeno Davatz wrote:
> > > > > On Wed, Dec 28, 2022 at 1:02 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > > > > > On Wed, Dec 28, 2022 at 08:37:52AM +0000, bugzilla-daemon@xxxxxxxxxx wrote:
> > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=216859
> > > > > >
> > > > > > > Summary: PCI bridge to bus boot hang at enumeration
> > > > > > > Kernel Version: 6.1-rc1
> > > > > > > ...
> > > > > >
> > > > > > > With Kernel 6.1-rc1 the enumeration process stopped working for me,
> > > > > > > see attachments.
> > > > > > >
> > > > > > > The enumeration works fine with Kernel 6.0 and below.
> > > > > > >
> > > > > > > Same problem still exists with v6.1 and v6.2-rc1.
> > > > > >
> > > > > > Thank you very much for your report, Zeno!
> > > > > >
> > > > > > v6.0 works, v6.1-rc1 fails. Would you mind booting v6.1-rc1 with the
> > > > > > "ignore_loglevel initcall_debug" kernel parameters and taking a photo
> > > > > > when it hangs?
> > > > >
> > > > > I will try this after January 7th, 2023.
> >
> > I updated the issue:
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=216859
> >
> > I booted with the option: "ignore_loglevel initcall_debug"
>
> Thanks! There's so much pcie output in that picture that we can't see
> any of the initcall logging. Can you capture another movie, but use
> kernel parameters like "ignore_loglevel initcall_debug boot_delay=100"
> to slow things down? The full-speed boot is too fast for the camera
> to capture all the output. You can do this on any convenient kernel
> that hangs.

Thanks for the new movie! The last initcalls I see before the hang
are:

  init_mqueue_fs
  key_proc_init
  jent_mod_init

We must have returned from jent_mod_init() because I think the "saving
config space" messages we see at the hang are from
pcie_portdrv_init().

I built 833477fce7a1 ("Merge tag 'sound-6.1-rc1' of
git://git.kernel.org/pub/scl) with your .config and when I boot it on
qemu, I see this:

  calling jent_mod_init+0x0/0x32 @ 1
  initcall jent_mod_init+0x0/0x32 returned 0 after 27185 usecs
  calling af_alg_init+0x0/0x45 @ 1
  NET: Registered PF_ALG protocol family
  ...
  calling sg_pool_init+0x0/0xb4 @ 1
  initcall sg_pool_init+0x0/0xb4 returned 0 after 462 usecs
  calling pcie_portdrv_init+0x0/0x43 @ 1
  pcieport 0000:00:1c.0: vgaarb: pci_notify
  pcieport 0000:00:1c.0: runtime IRQ mapping not provided by arch
  pcieport 0000:00:1c.0: enabling bus mastering
  pcieport 0000:00:1c.0: PME: Signaling with IRQ 24
  pcieport 0000:00:1c.0: AER: enabled with IRQ 24
  pcieport 0000:00:1c.0: saving config space at offset 0x0 (reading 0x34208086)
  pcieport 0000:00:1c.0: saving config space at offset 0x4 (reading 0x100507)
  pcieport 0000:00:1c.0: saving config space at offset 0x8 (reading 0x6040002)
  ...
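
If a capture ends up as text (for example from a serial console), the
"called but never returned" reasoning can be checked mechanically. The
following is only a rough Python sketch, not something from the kernel
tree: the function name unfinished_initcalls and the regular expressions
are assumptions based on the initcall_debug lines in the qemu output from
this message. The sample is abridged from that output and leaves out
af_alg_init, because its "returned" line falls in the elided part.

```python
import re

# Assumed line shapes, taken from the initcall_debug output in this thread:
#   calling  <symbol>+0x<off>/0x<size> @ <pid>
#   initcall <symbol>+0x<off>/0x<size> returned <rc> after <n> usecs
CALL_RE = re.compile(r"\bcalling\s+([\w.]+)\+0x")
RET_RE = re.compile(r"\binitcall\s+([\w.]+)\+0x\S+.*?\breturned\s+(-?\d+)")


def unfinished_initcalls(log_text):
    """Return initcalls that have a 'calling' line but no 'returned' line."""
    pending = []
    for line in log_text.splitlines():
        called = CALL_RE.search(line)
        if called:
            pending.append(called.group(1))
            continue
        returned = RET_RE.search(line)
        if returned and returned.group(1) in pending:
            pending.remove(returned.group(1))
    return pending


# Abridged sample (hypothetical input file contents).
SAMPLE = """\
calling jent_mod_init+0x0/0x32 @ 1
initcall jent_mod_init+0x0/0x32 returned 0 after 27185 usecs
calling sg_pool_init+0x0/0xb4 @ 1
initcall sg_pool_init+0x0/0xb4 returned 0 after 462 usecs
calling pcie_portdrv_init+0x0/0x43 @ 1
pcieport 0000:00:1c.0: saving config space at offset 0x0 (reading 0x34208086)
"""

print(unfinished_initcalls(SAMPLE))
```

On a truncated log, an initcall left in the list is only a candidate for
where the hang occurs; it may simply be the last thing the capture caught.
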
Would you mind trying again with "boot_delay=1000 pcie_ports=compat"?
"boot_delay=1000" should slow it down more (all the action is in the
last 3 seconds and it's still hard to see) and "pcie_ports=compat"
should turn off the PCIe port driver.

Bjorn