Re: [PATCH 00/24] Thunderbolt security levels and NVM firmware upgrade

From: Mika Westerberg
Date: Wed May 24 2017 - 07:16:19 EST


On Tue, May 23, 2017 at 05:30:43PM +0000, Mario.Limonciello@xxxxxxxx wrote:
> (Sorry my email client is not going to wrap these at 80 columns)

That's fine. It is more readable this way :)

> [ 0.467319] pci 0000:00:1c.0: [8086:9d10] type 01 class 0x060400
> [ 0.467389] pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
> [ 0.467513] pci 0000:00:1c.0: System wakeup disabled by ACPI

[...]

> [ 0.469363] pci 0000:01:00.0: [8086:1576] type 01 class 0x060400
> [ 0.469483] pci 0000:01:00.0: supports D1 D2
> [ 0.469484] pci 0000:01:00.0: PME# supported from D0 D1 D2 D3hot D3cold
> [ 0.469570] pci 0000:01:00.0: System wakeup disabled by ACPI
> [ 0.469609] pci 0000:00:1c.0: PCI bridge to [bus 01-39]
> [ 0.469614] pci 0000:00:1c.0: bridge window [mem 0xc4000000-0xda0fffff]
> [ 0.469618] pci 0000:00:1c.0: bridge window [mem 0xa0000000-0xc1ffffff 64bit pref]
> [ 0.469621] pci 0000:01:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring

This is the problem. Here the PCIe upstream port (0000:01:00.0) is
visible to Linux but it is not fully configured by the BIOS: its bus
numbers (primary/secondary/subordinate) are all set to 0.
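
For anyone who wants to check this on a running system, here is a
minimal sketch that decodes the three bus number registers from a
bridge's config space. The offsets 0x18..0x1a are the standard type 1
header layout; the helper names are made up for illustration, and the
all-zero test is a simplification, not the exact check the kernel does.

```python
import struct
from pathlib import Path

# Type 1 (PCI-to-PCI bridge) header: primary, secondary and subordinate
# bus numbers are three consecutive bytes starting at offset 0x18.
PCI_PRIMARY_BUS = 0x18


def bridge_bus_numbers(config: bytes) -> tuple:
    """Return (primary, secondary, subordinate) from raw config space."""
    return struct.unpack_from("<BBB", config, PCI_PRIMARY_BUS)


def looks_unconfigured(config: bytes) -> bool:
    """Simplified assumption: treat an all-zero triple as 'not set up'."""
    return bridge_bus_numbers(config) == (0, 0, 0)


def read_config(bdf: str) -> bytes:
    """Read config space via sysfs, e.g. bdf = '0000:01:00.0'."""
    return Path(f"/sys/bus/pci/devices/{bdf}/config").read_bytes()
```

Something like looks_unconfigured(read_config("0000:01:00.0")) would
then tell whether the port is still in the state the BIOS left it in.
Note that by the time userspace runs, Linux has typically already
reprogrammed these registers.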

At this point Linux decides to configure the port itself, and that goes
wrong because our allocation strategy tries to keep resource windows,
including reserved buses, as small as possible so that everything we
currently find barely fits there.

This continues a few lines below:

> [ 0.469670] pci_bus 0000:02: busn_res: can not insert [bus 02-ff] under [bus 01-39] (conflicts with (null) [bus 01-39])
> [ 0.469688] pci 0000:02:00.0: [8086:1576] type 01 class 0x060400
> [ 0.469809] pci 0000:02:00.0: supports D1 D2
> [ 0.469810] pci 0000:02:00.0: PME# supported from D0 D1 D2 D3hot D3cold
> [ 0.469877] pci 0000:02:01.0: [8086:1576] type 01 class 0x060400
> [ 0.470000] pci 0000:02:01.0: supports D1 D2
> [ 0.470001] pci 0000:02:01.0: PME# supported from D0 D1 D2 D3hot D3cold
> [ 0.470067] pci 0000:02:02.0: [8086:1576] type 01 class 0x060400
> [ 0.470188] pci 0000:02:02.0: supports D1 D2
> [ 0.470189] pci 0000:02:02.0: PME# supported from D0 D1 D2 D3hot D3cold
> [ 0.470277] pci 0000:01:00.0: PCI bridge to [bus 02-ff]
> [ 0.470283] pci 0000:01:00.0: bridge window [io 0x0000-0x0fff]
> [ 0.470287] pci 0000:01:00.0: bridge window [mem 0x00000000-0x000fffff]
> [ 0.470294] pci 0000:01:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref]
> [ 0.470296] pci 0000:02:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [ 0.470304] pci 0000:02:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [ 0.470312] pci 0000:02:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring

Here.

And it ends in failure when we create PCIe tunnels later on.
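
To illustrate the busn_res message quoted earlier: a child bus range can
only be inserted if it lies entirely inside the parent's range. The
sketch below just does that containment check on the "[bus xx-yy]"
notation used in the log. The function names are hypothetical and this
is not the kernel's resource code.

```python
import re

# Matches the "[bus 02-ff]" notation used in the PCI log lines.
BUS_RANGE = re.compile(r"\[bus ([0-9a-f]{2})-([0-9a-f]{2})\]")


def parse_bus_range(text: str) -> tuple:
    """Return (start, end) of the first '[bus xx-yy]' found in text."""
    match = BUS_RANGE.search(text)
    if match is None:
        raise ValueError(f"no bus range in {text!r}")
    return int(match.group(1), 16), int(match.group(2), 16)


def fits_under(child: tuple, parent: tuple) -> bool:
    """True if the child bus range lies entirely within the parent's."""
    return parent[0] <= child[0] and child[1] <= parent[1]
```

With the values from the log, fits_under(parse_bus_range("[bus 02-ff]"),
parse_bus_range("[bus 01-39]")) is False, because bus ff is beyond the
39 that the root port forwards.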

Now, this is probably where Windows does something different; for
example, it may skip the reconfiguration phase, which could explain why
it works. However, to me this looks very much like a bug in the
BIOS/firmware, as we expect the BIOS to configure the PCIe devices
properly before the OS is sent the ACPI hotplug event.

We need to handle this in Linux in the same way Windows does, but
currently I have no idea how. It is, however, more related to our PCI
enumeration code than to the patches in question, I think.

I also have a Dell 9350 here so I can reproduce the problem, and I'm
going to investigate this further, probably involving the Linux PCI
people.

My testing on the machine shows this behaviour only when the cable is
connected during boot.

If I connect the cable after the OS has booted I don't see the problem,
even if I do an unplug/plug cycle.

Can you try that as well (again)? If you see the problem, please send me
the dmesg. I have the latest BIOS (1.4.17) and NVM 16 so this machine
configuration should match yours if I'm not mistaken.
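
If it helps when comparing logs, here is a small sketch that picks out
the two kinds of lines discussed above from a dmesg dump. The patterns
are simply copied from the messages quoted in this mail, and the
function names are made up; please still send the full dmesg.

```python
import re
import subprocess

# Patterns taken from the log lines quoted in this thread.
PATTERNS = (
    re.compile(r"bridge configuration invalid \(\[bus [0-9a-f]{2}-[0-9a-f]{2}\]\)"),
    re.compile(r"busn_res: can not insert"),
)


def suspicious_lines(dmesg_text: str) -> list:
    """Return the lines of dmesg_text that match any pattern above."""
    return [
        line
        for line in dmesg_text.splitlines()
        if any(pattern.search(line) for pattern in PATTERNS)
    ]


def from_running_kernel() -> list:
    """Run dmesg (may need root) and filter its output."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return suspicious_lines(result.stdout)
```

An empty result after plugging the cable on a booted system, and
matching lines when the cable is connected during boot, would be
consistent with what I see here.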