Re: [PATCH v4] Subject: PCI: Enable io space 1k granularity for intel cpu root port

From: Bjorn Helgaas
Date: Fri Jul 12 2024 - 14:49:08 EST


On Tue, Jul 02, 2024 at 03:56:49AM +0000, Zhou Shengqing wrote:
> This patch add 1k granularity for intel root port bridge. Intel latest
> server CPU support 1K granularity, And there is an BIOS setup item named
> "EN1K", but linux doesn't support it. if an IIO has 5 IOU (SPR has 5 IOUs)
> all are bifurcated 2x8.In a 2P server system,There are 20 P2P bridges
> present. if keep 4K granularity allocation,it need 20*4=80k io space,
> exceeding 64k. I test it in a 16*nvidia 4090s system under intel eaglestrem
> platform. There are six 4090s that cannot be allocated I/O resources.
> So I applied this patch. And I found a similar implementation in quirks.c,
> but it only targets the Intel P64H2 platform.
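
For reference, the budget arithmetic in the commit log can be checked with
a small user-space sketch. It is not kernel code. The helper names are
hypothetical, and it assumes every bridge gets exactly one minimum-size
I/O window while ignoring I/O space claimed by any other device:

```c
#include <assert.h>
#include <stdbool.h>

/* Total legacy PCI I/O port space, in KiB (the 64k from the commit log). */
#define IO_SPACE_KIB 64u

/*
 * Hypothetical helper: I/O space consumed, in KiB, if each bridge is
 * given one window of the minimum granularity.
 */
static unsigned int io_needed_kib(unsigned int bridges, unsigned int granule_kib)
{
	assert(granule_kib > 0);
	return bridges * granule_kib;
}

/* Hypothetical helper: true if one window per bridge fits in 64 KiB. */
static bool io_windows_fit(unsigned int bridges, unsigned int granule_kib)
{
	return io_needed_kib(bridges, granule_kib) <= IO_SPACE_KIB;
}
```

With the 20 bridges described above, io_windows_fit(20, 4) is false
(80 KiB needed) and io_windows_fit(20, 1) is true (20 KiB needed).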

I think this has potential. Can you include a more complete citation
for the Intel spec? Complete name, document number if available,
revision, section? Hopefully it's publicly available?

> Signed-off-by: Zhou Shengqing <zhoushengqing@xxxxxxxxxxx>
> ---
> drivers/pci/quirks.c | 30 ++++++++++++++++++++++++++++++
> 1 file changed, 30 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 568410e64ce6..f30083d51e15 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -2562,6 +2562,36 @@ static void quirk_p64h2_1k_io(struct pci_dev *dev)
> }
> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1460, quirk_p64h2_1k_io);
>
> +/* Enable 1k I/O space granularity on the intel root port */
> +static void quirk_intel_rootport_1k_io(struct pci_dev *dev)
> +{
> + struct pci_dev *d = NULL;
> + u16 en1k = 0;
> + struct pci_dev *root_port = pcie_find_root_port(dev);
> +
> + if (!root_port)
> + return;

This doesn't seem quite right to me. The point is to set
dev->io_window_1k when "dev" is a Root Port itself and when the EN1K
bit is set in a [8086:09a2] device.

So I don't think we need to *look* for the Root Port, we just need to
check that "dev" itself *is* a Root Port, e.g.,

  if (!pci_is_pcie(dev) || pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT)
    return;

> + /*
> + * Per intel sever CPU EDS vol2(register) spec,
> + * Intel Memory Map/Intel VT-d configuration space,
> + * IIO MISC Control (IIOMISCCTRL_1_5_0_CFG) — Offset 1C0h
> + * bit 2.
> + * Enable 1K (EN1K):
> + * This bit when set, enables 1K granularity for I/O space decode
> + * in each of the virtual P2P bridges
> + * corresponding to root ports, and DMI ports.
> + */
> + while ((d = pci_get_device(PCI_VENDOR_ID_INTEL, 0x09a2, d))) {

To be safe, "d" (the [8086:09a2] device) should be on the same bus as
"dev". With VMD, I think we get Root Ports *below* the VMD bridge,
which would be a different bus, and they presumably are not influenced
by the EN1K bit.

> + pci_read_config_word(d, 0x1c0, &en1k);
> + if (en1k & 0x4) {
> + pci_info(d, "INTEL: System should support 1k io window\n");

If we log this, I think it should be with "dev", not "d". We will
likely have several Root Ports, and logging with "d" would produce
several identical messages. Maybe something like this:

  pci_info(dev, "1K I/O windows enabled per %s EN1K setting\n", pci_name(d));
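
Taken together, the comments above amount to three checks: "dev" is
itself a Root Port, "d" is on the same bus as "dev", and EN1K is set in
"d". A minimal user-space model of that decision might look like the
sketch below. It is not the kernel implementation: struct mock_dev and
mock_quirk_1k_io() are hypothetical stand-ins for struct pci_dev and
the quirk, and the device list is passed in as a plain array instead of
being walked with pci_get_device().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_INTEL   0x8086u
#define DEVICE_IIO_VTD 0x09a2u	/* [8086:09a2], holds IIOMISCCTRL */
#define EN1K_BIT       0x4u	/* bit 2 of the config word at offset 1C0h */

/* Hypothetical stand-in for the few struct pci_dev fields used here. */
struct mock_dev {
	uint16_t vendor, device;
	int bus;		/* bus number the device sits on */
	bool is_root_port;	/* models pci_pcie_type() == PCI_EXP_TYPE_ROOT_PORT */
	uint16_t iiomiscctrl;	/* models the config word read at 0x1c0 */
	bool io_window_1k;
};

/*
 * Set dev->io_window_1k only if dev is an Intel Root Port and an
 * [8086:09a2] device on the same bus has EN1K set. Returns that
 * device (for logging), or NULL if nothing was enabled.
 */
static const struct mock_dev *mock_quirk_1k_io(struct mock_dev *dev,
					       const struct mock_dev *all,
					       size_t n)
{
	assert(dev != NULL);

	if (dev->vendor != VENDOR_INTEL || !dev->is_root_port)
		return NULL;

	for (size_t i = 0; i < n; i++) {
		const struct mock_dev *d = &all[i];

		if (d->vendor != VENDOR_INTEL || d->device != DEVICE_IIO_VTD)
			continue;
		if (d->bus != dev->bus)	/* e.g. Root Ports below VMD */
			continue;
		if (d->iiomiscctrl & EN1K_BIT) {
			dev->io_window_1k = true;
			return d;
		}
	}
	return NULL;
}
```

In this model a Root Port on a bus with no [8086:09a2] device, such as
one below a VMD bridge, is left with 4K windows.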

> + dev->io_window_1k = 1;
> + }
> + }
> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, quirk_intel_rootport_1k_io);
> +
> /*
> * Under some circumstances, AER is not linked with extended capabilities.
> * Force it to be linked by setting the corresponding control bit in the
> --
> 2.39.2
>