Re: [GIT PULL] PCI fixes for v5.7

From: Bjorn Helgaas
Date: Thu Apr 23 2020 - 23:23:09 EST


On Thu, Apr 23, 2020 at 11:22:20AM -0700, Linus Torvalds wrote:
> On Thu, Apr 23, 2020 at 10:40 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> >
> > - Workaround Apex TPU class code issue that prevents resource
> > assignment (Bjorn Helgaas)
>
> Hmm.
>
> I have no objections to that patch, but I do wonder if it might not be
> better to try to actually assign the resource at enable_resource time?
>
> Put another way: if I read the situation correctly, what happened is
> that the hardware is broken and doesn't have the proper class code,
> and so the resource is not initially assigned at all. But then the
> driver matches on the device ID, and tries to use the device, and then
> we get into trouble at pci_enable_resources().

Exactly.

> But is there any reason we don't just at least try to do
> pci_assign_resource() at that point? Yeah, because we didn't do it at
> bus scanning, maybe there's no room for it, but that's what we do for
> the PCI ROM resources (which I think we also don't claim by default)
> when drivers ask to map them.

That might make sense, but I think we should be consistent with the
checking __dev_sort_resources() does, e.g., skipping
PCI_CLASS_NOT_DEFINED, or at least understand why it's safe to be
different.
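
To make the trade-off concrete, here is a rough userspace model of the
two policies. It is only a sketch under stated assumptions: the model_*
names, the struct fields, and the late_assign switch are made up for
illustration and are not kernel APIs, and the real code paths have many
more cases than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_CLASS_NOT_DEFINED  0x0000 /* stands in for PCI_CLASS_NOT_DEFINED */
#define MODEL_CLASS_SYSTEM_OTHER 0x0880 /* illustrative "valid" class code */

struct model_dev {
	unsigned short class;  /* base class and subclass */
	bool bar_assigned;     /* stands in for res->parent != NULL */
	bool space_available;  /* would an assignment attempt succeed? */
};

/* Scan-time pass: skip classless devices, like __dev_sort_resources(). */
static void model_assign_at_scan(struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->class == MODEL_CLASS_NOT_DEFINED)
		return;
	if (dev->space_available)
		dev->bar_assigned = true;
}

/*
 * Enable-time check. Without late_assign, an unassigned resource is an
 * error. With late_assign, try once more first, in the spirit of what
 * pci/rom.c does for the ROM resource.
 */
static int model_enable(struct model_dev *dev, bool late_assign)
{
	assert(dev != NULL);
	if (!dev->bar_assigned && late_assign && dev->space_available)
		dev->bar_assigned = true;
	return dev->bar_assigned ? 0 : -1;
}
```

In this model a classless device fails model_enable() unless
late_assign is set, and even then only succeeds if there is still room,
which is the "maybe there's no room for it" caveat.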

> The pci/rom.c code does
>
>         /* assign the ROM an address if it doesn't have one */
>         if (res->parent == NULL && pci_assign_resource(pdev, PCI_ROM_RESOURCE))
>                 return NULL;
>
> could we perhaps do the same in enable_resource?
>
> Your patch is obviously much better for an -rc kernel, so this is more
> of a longer-term "wouldn't it be less fragile to ..." query.
>
> Alternatively, maybe we should do resource assignment even for
> PCI_CLASS_NOT_DEFINED?

Yeah. I don't know the history of why we skip PCI_CLASS_NOT_DEFINED.
I did consider adding a message about the fact that we're skipping
it, to make it easier to debug next time.

PCI_CLASS_NOT_DEFINED is supposed to be for devices built before the
Class Code field was defined. That note is at least as old as PCI 2.2
from 1998, so there shouldn't be *that* many of those devices left.

Bjorn