Re: [RFC v4 1/7] PCI: Introduce domain_nr in pci_host_bridge

From: Bjorn Helgaas
Date: Thu Jul 15 2021 - 15:29:43 EST


On Fri, Jul 16, 2021 at 01:30:52AM +0800, Boqun Feng wrote:
> On Wed, Jul 14, 2021 at 02:33:19PM -0500, Bjorn Helgaas wrote:
> > On Wed, Jul 14, 2021 at 06:27:31PM +0800, Boqun Feng wrote:
> > > Currently we retrieve the PCI domain number of the host bridge from the
> > > bus sysdata (or pci_config_window if PCI_DOMAINS_GENERIC=y). Actually
> > > we have the information at PCI host bridge probing time, and it makes
> > > sense that we store it into pci_host_bridge. One benefit of doing so is
> > > the requirement for supporting PCI on Hyper-V for ARM64, because the
> > > host bridge of Hyper-V doesn't have pci_config_window, whereas ARM64 is
> > > a PCI_DOMAINS_GENERIC=y arch, so we cannot retrieve the PCI domain
> > > number from pci_config_window on ARM64 Hyper-V guest.
> > >
> > > As the preparation for ARM64 Hyper-V PCI support, we introduce the
> > > domain_nr in pci_host_bridge and a sentinel value to allow drivers to
> > > set domain numbers properly at probing time. Currently
> > > CONFIG_PCI_DOMAINS_GENERIC=y archs are only users of this
> > > newly-introduced field.
> >
> > Thanks for pushing on this. PCI_DOMAINS_GENERIC is really not very
> > generic today and it will be good to make it more so.
> >
> > > Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>
> > > ---
> > > drivers/pci/probe.c | 6 +++++-
> > > include/linux/pci.h | 10 ++++++++++
> > > 2 files changed, 15 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > > index 79177ac37880..60c50d4f156f 100644
> > > --- a/drivers/pci/probe.c
> > > +++ b/drivers/pci/probe.c
> > > @@ -594,6 +594,7 @@ static void pci_init_host_bridge(struct pci_host_bridge *bridge)
> > > bridge->native_pme = 1;
> > > bridge->native_ltr = 1;
> > > bridge->native_dpc = 1;
> > > + bridge->domain_nr = PCI_DOMAIN_NR_NOT_SET;
> > >
> > > device_initialize(&bridge->dev);
> > > }
> > > @@ -898,7 +899,10 @@ static int pci_register_host_bridge(struct pci_host_bridge *bridge)
> > > bus->ops = bridge->ops;
> > > bus->number = bus->busn_res.start = bridge->busnr;
> > > #ifdef CONFIG_PCI_DOMAINS_GENERIC
> > > - bus->domain_nr = pci_bus_find_domain_nr(bus, parent);
> > > + if (bridge->domain_nr == PCI_DOMAIN_NR_NOT_SET)
> > > + bus->domain_nr = pci_bus_find_domain_nr(bus, parent);
> > > + else
> > > + bus->domain_nr = bridge->domain_nr;
> >
> > The domain_nr in struct pci_bus is really only used by
> > pci_domain_nr(). It seems like it really belongs in the struct
> > pci_host_bridge and probably doesn't need to be duplicated in the
> > struct pci_bus. But that's probably a project for the future.
>
> Agreed. Maybe we can define pci_bus_domain_nr() as:
>
> static inline int pci_domain_nr(struct pci_bus *bus)
> {
> struct device *bridge = bus->bridge;
> struct pci_host_bridge *b = container_of(bridge, struct pci_host_bridge, dev);
>
> return b->domain_nr;
> }
>
> but apart from corretness (e.g. should we use get_device() for
> bus->bridge?), it makes more sense if ->domain_nr of pci_host_bridge
> is used (as a way to set domain number at probing time) for most of
> drivers and archs. ;-)

If we're holding a struct pci_bus *, we must have a reference on the
bus, which in turn holds a reference on upstream devices, so there
should be no need for get_device() here.

But yes, I think something like this is where we should be heading.

> > > #endif
> > >
> > > b = pci_find_bus(pci_domain_nr(bus), bridge->busnr);
> > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > index 540b377ca8f6..952bb7d46576 100644
> > > --- a/include/linux/pci.h
> > > +++ b/include/linux/pci.h
> > > @@ -526,6 +526,15 @@ static inline int pci_channel_offline(struct pci_dev *pdev)
> > > return (pdev->error_state != pci_channel_io_normal);
> > > }
> > >
> > > +/*
> > > + * PCI Conventional has at most 256 PCI bus segments and PCI Express has at
> > > + * most 65536 "PCI Segments Groups", therefore -1 is not a valid PCI domain
> >
> > s/Segments/Segment/
> >
> > Do you have a reference for these limits? I don't think either
> > Conventional PCI or PCIe actually specifies a hardware limit on the
> > number of domains (I think PCI uses "segment group" to mean the same
> > thing).
> >
> > "Segment" in the Conventional PCI spec, r3.0, means a bus segment,
> > which connects all the devices on a single bus. Obviously there's a
> > limit of 256 buses under a single host bridge, but that's different
> > concept than a domain/segment group.
> >
> > The PCI Firmware spec, r3.3, defines "Segment Group Number" as being
> > in the range 0..65535, but as far as I know, that's just a firmware
> > issue, and it applies equally to Conventional PCI and PCIe.
> >
> > I think you're right that -1 is a reasonable sentinel; I just don't
> > want to claim a difference here unless there really is one.
> >
>
> I think you're right, I got confused on the concepts of "Segment" and
> "Segment Group".
>
> After digging in specs, I haven't find any difference on the limitation
> between Conventional PCI and PCIe. The PCI Firmware spec, r3.2, refers
> ACPI (3.0 and later) spec for the details of "Segment Group", and in
> ACPI spec v6.3, the description _SEG object says:
>
> """
> The lower 16 bits of _SEG returned integer is the PCI Segment Group
> number. Other bits are reserved.
> """
>
> So I'm thinking replacing the comments with:
>
> Currently in ACPI spec, for each PCI host bridge, PCI Segment Group
> number is limited to a 16-bit value, therefore (int)-1 is not a valid
> PCI domain number, and can be used as a sentinel value indicating
> ->domain_nr is not set by the driver (and CONFIG_PCI_DOMAINS_GENERIC=y
> archs will set it with pci_bus_find_domain_nr()).

Yes, I think that's a better description.

> > > + * number, and can be used as a sentinel value indicating ->domain_nr is not
> > > + * set by the driver (and CONFIG_PCI_DOMAINS_GENERIC=y can set it in generic
> > > + * code).