Re: [PATCH 00/24] Thunderbolt security levels and NVM firmware upgrade
From: Mika Westerberg
Date: Thu May 25 2017 - 03:19:33 EST
On Wed, May 24, 2017 at 07:06:33PM +0000, Mario.Limonciello@xxxxxxxx wrote:
> > -----Original Message-----
> > From: Mika Westerberg [mailto:mika.westerberg@xxxxxxxxxxxxxxx]
> > Sent: Wednesday, May 24, 2017 6:11 AM
> > To: Limonciello, Mario <Mario_Limonciello@xxxxxxxx>
> > Cc: gregkh@xxxxxxxxxxxxxxxxxxx; andreas.noever@xxxxxxxxx;
> > michael.jamet@xxxxxxxxx; yehezkel.bernat@xxxxxxxxx; lukas@xxxxxxxxx;
> > amir.jer.levy@xxxxxxxxx; luto@xxxxxxxxxx; Dominguez, Jared
> > <Jared_Dominguez@xxxxxxxx>; andriy.shevchenko@xxxxxxxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx
> > Subject: Re: [PATCH 00/24] Thunderbolt security levels and NVM firmware upgrade
> >
> > On Tue, May 23, 2017 at 05:30:43PM +0000, Mario.Limonciello@xxxxxxxx wrote:
> > > (Sorry my email client is not going to wrap these at 80 columns)
> >
> > That's fine. It is more readable this way :)
> >
> > > [ 0.467319] pci 0000:00:1c.0: [8086:9d10] type 01 class 0x060400
> > > [ 0.467389] pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
> > > [ 0.467513] pci 0000:00:1c.0: System wakeup disabled by ACPI
> >
> > [...]
> >
> > > [ 0.469363] pci 0000:01:00.0: [8086:1576] type 01 class 0x060400
> > > [ 0.469483] pci 0000:01:00.0: supports D1 D2
> > > [ 0.469484] pci 0000:01:00.0: PME# supported from D0 D1 D2 D3hot D3cold
> > > [ 0.469570] pci 0000:01:00.0: System wakeup disabled by ACPI
> > > [ 0.469609] pci 0000:00:1c.0: PCI bridge to [bus 01-39]
> > > [ 0.469614] pci 0000:00:1c.0: bridge window [mem 0xc4000000-0xda0fffff]
> > > [ 0.469618] pci 0000:00:1c.0: bridge window [mem 0xa0000000-0xc1ffffff 64bit pref]
> > > [ 0.469621] pci 0000:01:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> >
> > This is the problem. Here the PCIe upstream port (0000:01:00.0) is
> > visible to Linux but it is not fully configured by the BIOS: its bus
> > numbers (primary/secondary/subordinate) are set to 0.
>
> So at least for me, the other difference in a successful run (where you plug
> in after boot instead) is that it shows up as:
> PCI bridge to [bus 02-39]
>
> Same bridge window though.
>
> >
> > At this point Linux decides to configure the port itself and goes wrong
> > since our allocation strategy tries to keep resource windows, including
> > reserved buses, as small as possible so that everything we currently find
> > barely fits there.
> >
> > This continues a few lines below:
> >
> > > [ 0.469670] pci_bus 0000:02: busn_res: can not insert [bus 02-ff] under [bus 01-39] (conflicts with (null) [bus 01-39])
> > > [ 0.469688] pci 0000:02:00.0: [8086:1576] type 01 class 0x060400
> > > [ 0.469809] pci 0000:02:00.0: supports D1 D2
> > > [ 0.469810] pci 0000:02:00.0: PME# supported from D0 D1 D2 D3hot D3cold
> > > [ 0.469877] pci 0000:02:01.0: [8086:1576] type 01 class 0x060400
> > > [ 0.470000] pci 0000:02:01.0: supports D1 D2
> > > [ 0.470001] pci 0000:02:01.0: PME# supported from D0 D1 D2 D3hot D3cold
> > > [ 0.470067] pci 0000:02:02.0: [8086:1576] type 01 class 0x060400
> > > [ 0.470188] pci 0000:02:02.0: supports D1 D2
> > > [ 0.470189] pci 0000:02:02.0: PME# supported from D0 D1 D2 D3hot D3cold
> > > [ 0.470277] pci 0000:01:00.0: PCI bridge to [bus 02-ff]
> > > [ 0.470283] pci 0000:01:00.0: bridge window [io 0x0000-0x0fff]
> > > [ 0.470287] pci 0000:01:00.0: bridge window [mem 0x00000000-0x000fffff]
> > > [ 0.470294] pci 0000:01:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref]
> > > [ 0.470296] pci 0000:02:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> > > [ 0.470304] pci 0000:02:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> > > [ 0.470312] pci 0000:02:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> >
> > Here.
> >
> > And ends up in failure when we create PCIe tunnels later on.
>
> For what it's worth the XPS 9365 which has a different BIOS core has these
> exact same behaviors on Linux if booted with the TBT dock plugged in.
>
> >
> > Now, this is probably where Windows does something else, like it may
> > skip the reconfiguring phase, which could explain why it works. However, to
> > me this looks pretty much like a bug in the BIOS/firmware as we are
> > expecting the BIOS to configure the PCIe devices properly before the OS
> > is sent the ACPI hotplug event.
> >
>
> I'll reach out to the BIOS guys to see if they can give some more comments
> from their perspective.
>
> I came across something interesting from browsing MSDN about this topic.
> It hasn't been updated in a long time but I think should still be a relevant
> indication of the approach that Windows was taking and why the firmware
> is this way and expecting OS to reconfigure.
>
> "The BIOS cannot preconfigure PCI-to-PCI (P2P) bridges on adapters during
> hot plug. Consequently, the operating system assigns resource windows of
> a default size to a bridge.
>
> I/O window. The default size for the I/O window is 4 KB in Windows 2000,
> Windows XP, and Windows Server 2003.
> Memory window. The configuration for the memory window differs for
> Windows 2000, Windows XP, and Windows Server 2003:
> * For Windows 2000, the default size for the memory window is 2 MB.
> * For Windows XP and Windows Server 2003, the operating system
> first attempts to find a memory window of 32 MB. If it cannot find a
> window of that size, the operating system attempts to find a memory
> window of progressively smaller sizes (16, 8, 4, 2, and finally 1 MB) until
> it finds a size that works."

I think the current BIOS actually configures the hotplugged bridge and
assigns resources accordingly. Once that is done, it triggers hotplug to
the OS using an ACPI event.
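
One way to check whether the firmware actually did that for a given
bridge is to look at the bus number registers in its config space. Below
is a minimal sketch, not anything from the patch series. It assumes the
standard type 1 (PCI-to-PCI bridge) header layout and the usual sysfs
config file; the helper names are made up for illustration, and the
"unconfigured" test is a simplification that just mirrors the
"[bus 00-00]" shown in the dmesg above, not the kernel's full check.

```python
from pathlib import Path

# Byte offsets in a type 1 (PCI-to-PCI bridge) configuration header.
PCI_PRIMARY_BUS = 0x18
PCI_SECONDARY_BUS = 0x19
PCI_SUBORDINATE_BUS = 0x1A


def bridge_bus_numbers(config: bytes) -> tuple[int, int, int]:
    """Return (primary, secondary, subordinate) from raw config space."""
    if len(config) <= PCI_SUBORDINATE_BUS:
        raise ValueError("config space dump is too short for a bridge header")
    return (
        config[PCI_PRIMARY_BUS],
        config[PCI_SECONDARY_BUS],
        config[PCI_SUBORDINATE_BUS],
    )


def bridge_is_unconfigured(config: bytes) -> bool:
    """Simplified check: secondary and subordinate both 0, i.e. [bus 00-00]."""
    _, secondary, subordinate = bridge_bus_numbers(config)
    return secondary == 0 and subordinate == 0


def read_config(bdf: str) -> bytes:
    """Read config space for a device such as '0000:01:00.0' via sysfs."""
    return Path(f"/sys/bus/pci/devices/{bdf}/config").read_bytes()
```

For example, bridge_is_unconfigured(read_config("0000:01:00.0")) on a
running system, with the caveat that by then Linux may already have
reprogrammed the bridge, so this says more when run against a dump taken
before enumeration.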

> > We need to handle this in Linux the same way Windows does, but
> > currently I have no idea how. It is, however, more related to our PCI
> > enumeration code than to the patches in question, I think.
> >
>
> Come to think of it, I have seen the dock have troubles if plugged in at
> boot on Linux even with SL0 before this patch series.
>
> > I also have a Dell 9350 here so I can reproduce the problem and I'm
> > going to investigate this further probably involving Linux PCI people.
>
> To clarify, are you reproducing it with a TB16 or some other TBT device?

I'm using a chain of 1 to 5 devices. I don't have a TB16 here but I don't
think it matters.

> > My testing on the machine shows this behaviour only when the cable is
> > connected during boot.
>
> Yep same.
>
> >
> > If I connect the cable after OS is booted I don't see the problem, even
> > if I do unplug / plug cycle.
> >
> > Can you try that also (again)? And if you see the problem, send me the
> > dmesg? I have the latest BIOS (1.4.17) and NVM 16 so this machine
> > configuration should match yours if I'm not mistaken.
>
> It does work properly if I boot with no cable plugged in and then plug one in.

OK, so we see the same behavior.

To summarize: this happens only on boot when a Thunderbolt device is
already connected.
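
To tell quickly whether a given boot hit this case, the dmesg can be
scanned for the "bridge configuration invalid" lines quoted above. A
rough sketch follows, assuming the message format is exactly as in this
log. The function name is only illustrative.

```python
import re

# Matches lines such as:
#   pci 0000:01:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
INVALID_BRIDGE = re.compile(
    r"pci (?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"bridge configuration invalid "
    r"\(\[bus (?P<sec>[0-9a-f]{2})-(?P<sub>[0-9a-f]{2})\]\)"
)


def bridges_reconfigured(dmesg_text: str) -> list[str]:
    """Return the PCI addresses of bridges Linux decided to reconfigure."""
    return [m.group("bdf") for m in INVALID_BRIDGE.finditer(dmesg_text)]
```

Feeding it the output of dmesg from a boot with the cable connected
should list 0000:01:00.0 and the downstream ports under it; an empty
list would suggest the BIOS had already assigned the bus numbers.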