Re: [PATCH v3 1/1] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window

From: Nicholas Johnson
Date: Mon Nov 18 2019 - 22:17:14 EST


On Mon, Nov 18, 2019 at 04:58:02PM -0600, Bjorn Helgaas wrote:
> On Mon, Nov 18, 2019 at 09:43:34AM +0000, Nicholas Johnson wrote:
> > On Thu, Nov 14, 2019 at 10:56:37AM -0600, Bjorn Helgaas wrote:
> > > On Wed, Nov 13, 2019 at 03:25:28PM +0000, Nicholas Johnson wrote:
> > > > Currently, the kernel can sometimes assign the MMIO_PREF window
> > > > additional size into the MMIO window, resulting in extra MMIO additional
> > > > size, despite the MMIO_PREF additional size being assigned successfully
> > > > into the MMIO_PREF window.
> > > >
> > > > This happens if in the first pass, the MMIO_PREF succeeds but the MMIO
> > > > fails. In the next pass, because MMIO_PREF is already assigned, the
> > > > attempt to assign MMIO_PREF returns an error code instead of success
> > > > (nothing more to do, already allocated). Hence, the size which is
> > > > actually allocated, but thought to have failed, is placed in the MMIO
> > > > window.
> > > >
> > > > Example of problem (more context can be found in the bug report URL):
> > > >
> > > > Mainline kernel:
> > > > pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0xa00fffff] = 256M
> > > > pci 0000:06:04.0: BAR 14: assigned [mem 0xa0200000-0xb01fffff] = 256M
> > > >
> > > > Patched kernel:
> > > > pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0x980fffff] = 128M
> > > > pci 0000:06:04.0: BAR 14: assigned [mem 0x98200000-0xa01fffff] = 128M
> > > >
> > > > This was using pci=realloc,hpmemsize=128M,nocrs - on the same machine
> > > > with the same configuration, with a Ubuntu mainline kernel and a kernel
> > > > patched with this patch.
> > > >
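(Aside, for anyone checking the numbers: the window sizes in the two
logs above follow from end - start + 1. A minimal C sketch, where
window_size() is a hypothetical helper written for this mail and not a
kernel function:)

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the kernel: size in bytes of an
 * inclusive address range [start, end] as printed in the log lines.
 */
static uint64_t window_size(uint64_t start, uint64_t end)
{
	return end - start + 1;
}
```

Applied to the logs above, both mainline windows come out at
0x10000000 bytes (256M) and both patched windows at 0x08000000 bytes
(128M).
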
> > > > The bug results in the MMIO_PREF being added to the MMIO window, which
> > > > means doubling if MMIO_PREF size = MMIO size. With a large MMIO_PREF,
> > > > the MMIO window will likely fail to be assigned altogether due to lack
> > > > of 32-bit address space.
> > > >
> > > > Change find_free_bus_resource() to do the following:
> > > > - Return first unassigned resource of the correct type.
> > > > - If none of the above, return first assigned resource of the correct type.
> > > > - If none of the above, return NULL.
> > > >
> > > > Returning an assigned resource of the correct type allows the caller to
> > > > distinguish between already assigned and no resource of the correct type.
> > > >
> > > > Rename find_free_bus_resource to find_bus_resource_of_type().
> > > >
> > > > Add checks in pbus_size_io() and pbus_size_mem() to return success if
> > > > resource returned from find_free_bus_resource() is already allocated.
> > > >
> > > > This avoids pbus_size_io() and pbus_size_mem() returning error code to
> > > > __pci_bus_size_bridges() when a resource has been successfully assigned
> > > > in a previous pass. This fixes the existing behaviour where space for a
> > > > resource could be reserved multiple times in different parent bridge
> > > > windows.
> > > >
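(To make the lookup order and the early return described above
concrete, here is a simplified standalone model. It is only a sketch
and not the patch itself: struct window, the WIN_* flags,
find_window_of_type() and size_window() are made-up stand-ins for
struct resource, the IORESOURCE_* flags, find_bus_resource_of_type()
and pbus_size_mem(), and the "assigned" field is assumed to model a
resource that already has a parent.)

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type bits; stand-ins for the kernel's IORESOURCE_* flags. */
#define WIN_IO       0x1UL
#define WIN_MEM      0x2UL
#define WIN_PREFETCH 0x4UL

/* Hypothetical stand-in for a bridge window, not struct resource. */
struct window {
	unsigned long flags;	/* type bits, e.g. WIN_MEM | WIN_PREFETCH */
	bool assigned;		/* true once the window has been allocated */
};

/*
 * Return the first unassigned window of the requested type. If there is
 * none, return the first assigned window of that type. If no window of
 * that type exists at all, return NULL.
 */
static struct window *find_window_of_type(struct window *w, size_t n,
					  unsigned long type_mask,
					  unsigned long type)
{
	struct window *first_assigned = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if ((w[i].flags & type_mask) != type)
			continue;
		if (!w[i].assigned)
			return &w[i];
		if (!first_assigned)
			first_assigned = &w[i];
	}
	return first_assigned;
}

enum size_result {
	SIZE_NO_WINDOW = -1,		/* caller may fall back to another window */
	SIZE_ALREADY_ASSIGNED = 0,	/* success: nothing more to do */
	SIZE_NEEDS_SIZING = 1,		/* caller goes on to size the window */
};

/* Models the check added at the top of the sizing functions. */
static enum size_result size_window(struct window *w, size_t n,
				    unsigned long type_mask,
				    unsigned long type)
{
	struct window *win = find_window_of_type(w, n, type_mask, type);

	if (!win)
		return SIZE_NO_WINDOW;
	if (win->assigned)
		return SIZE_ALREADY_ASSIGNED;
	return SIZE_NEEDS_SIZING;
}
```

Returning the assigned window rather than NULL is what lets the caller
tell "already done" apart from "no such window", so an MMIO_PREF window
assigned in an earlier pass is no longer treated as a failure and its
size is not pushed into the MMIO window.
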
> > > > Link: https://lore.kernel.org/lkml/20190531171216.20532-2-logang@xxxxxxxxxxxx/T/#u
> > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=203243
> > > >
> > > > Reported-by: Kit Chow <kchow@xxxxxxxxxx>
> > > > Reported-by: Nicholas Johnson <nicholas.johnson-opensource@xxxxxxxxxxxxxx>
> > > > Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@xxxxxxxxxxxxxx>
> > > > Reviewed-by: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx>
> > >
> > > Applied with reviewed-by from Mika and Logan to pci/resource for v5.5,
> > > thanks!
> >
> > We have v5.4-rc8, so there is one more week. Please let me know if you
> > have any concerns about the other four patches so that I may address
> > them ASAP. If you are worried about the first one, I can re-post the
> > series with it at the end, so that the others can be taken first.
>
> I assume you're talking about this:
>
> [PATCH v11 0/4] Patch series to assist Thunderbolt allocation with kernel parameters
>
> I hope to merge those early in the next cycle so we get some time in
> linux-next for wider testing. It's later in the v5.5 cycle than I
> would be comfortable with.
>
> Bjorn

Fair enough. Thanks for this info. :)

I have only just discovered linux-next and built it. Should I be
building it regularly to help find regressions?

I will now concentrate on fixing the problem where pci=nocrs does not
ignore the bus resource. One motherboard I own reports a bus range of
00-7e or similar, instead of 00-ff. Passing nocrs does not help, so I
had to patch the kernel myself. Otherwise only acpi=off works around
the problem, at the cost of losing SMT (MADT), the IOMMU (DMAR) and the
ability to suspend without crashing.

If you disagree that nocrs should override the bus resource, then let
me know and I will not attempt this.

Nicholas