Re: [PATCH v12 1/1] vfio/nvgpu: Add vfio pci variant module for grace hopper

From: Ankit Agrawal
Date: Wed Oct 25 2023 - 13:15:19 EST


> BTW, it's still never been answered why the latest QEMU series dropped
> the _DSD support.

The _DSD keys were there in v1 to communicate the PXM start ID and the
count associated with the device to the VM kernel. In v2, we proposed an
alternative approach that leverages the Generic Initiator (GI) Affinity
structure in SRAT (ACPI Spec 6.5, Section 5.2.16.6) to create the NUMA nodes.
The GI structure associates a GI (the GPU in this case) with a proximity
domain. So we create 8 GI structures, each with a unique PXM ID and the
device BDF. This removes the need for the _DSD keys, as the VM kernel can
parse the GI structures and identify the PXM IDs associated with the device
using the BDF.
(E.g. https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdkfd/kfd_crat.c#L1938)
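
To make the lookup concrete, here is a minimal userspace sketch of matching
GI entries against a device BDF. It is only a model: struct gi_entry,
make_bdf() and gi_collect_pxms() are hypothetical names, and the struct keeps
just the proximity domain and the PCI segment/BDF from the device handle, not
the full ACPI layout.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified, hypothetical model of one SRAT Generic Initiator Affinity
 * entry. Only the fields needed for the lookup are kept.
 */
struct gi_entry {
	uint32_t pxm;      /* proximity domain (PXM ID) */
	uint16_t segment;  /* PCI segment from the device handle */
	uint16_t bdf;      /* bus[15:8], device[7:3], function[2:0] */
};

/* Pack bus/device/function into the usual 16-bit BDF encoding. */
static inline uint16_t make_bdf(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x7));
}

/*
 * Walk the table and collect the PXM IDs whose segment and BDF match the
 * device. At most 'max' IDs are stored in 'out'; the return value is the
 * total number of matching entries.
 */
static size_t gi_collect_pxms(const struct gi_entry *tbl, size_t n,
			      uint16_t segment, uint16_t bdf,
			      uint32_t *out, size_t max)
{
	size_t found = 0;

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].segment != segment || tbl[i].bdf != bdf)
			continue;
		if (found < max)
			out[found] = tbl[i].pxm;
		found++;
	}
	return found;
}
```

With 8 GI structures carrying the same BDF, the walk returns 8 and fills in
the 8 PXM IDs, which is the information the _DSD start id and count used to
carry.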

> In light of that, I don't think we should be independently calculating
> the BAR2 region size using roundup_pow_of_two(nvdev->memlength).
> Instead we should be using pci_resource_len() of the physical BAR2 to
> make it evident that this relationship exists.

Sure, I will make the change in the next posting.
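
As a rough illustration of the difference, the two ways of sizing the region
can be modelled in userspace as below. This is a sketch, not the driver code:
roundup_pow2_u64() stands in for the kernel's roundup_pow_of_two(), and
bar2_size_from_resource() only mimics what pci_resource_len() reports for an
inclusive [start, end] resource range. Both helper names and the values used
with them are assumptions.

```c
#include <stdint.h>

/*
 * Userspace stand-in for roundup_pow_of_two(): smallest power of two that
 * is >= n. Only meant for 0 < n <= 2^63.
 */
static uint64_t roundup_pow2_u64(uint64_t n)
{
	uint64_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Current approach: size derived independently from memlength. */
static uint64_t bar2_size_computed(uint64_t memlength)
{
	return roundup_pow2_u64(memlength);
}

/*
 * Suggested approach: take the length of the physical BAR2. The resource
 * end address is inclusive, hence the +1; an unassigned BAR (end == 0) is
 * reported as length 0.
 */
static uint64_t bar2_size_from_resource(uint64_t start, uint64_t end)
{
	if (end == 0)
		return 0;
	return end - start + 1;
}
```

Reading the size from the BAR makes the tie to the physical BAR2 explicit
instead of recomputing it from memlength.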

> The comments throughout should also be updated to reflect this as
> currently they're written as if there is no physical BAR2 and we're
> making a completely independent decision relative to BAR2 sizing.  A
> comment should also be added to nvgrace_gpu_vfio_pci_read/write()
> explaining that the physical BAR2 provides the correct behavior
> relative to config space accesses.

Yeah, will update the comments.

> The probe function should also fail if pci_resource_len() for BAR2 is
> not sufficient for the coherent memory region.  Thanks,

Ack.
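
A minimal sketch of the kind of check probe could do, written as a userspace
model. check_bar2_covers_mem() is a hypothetical helper and the -EINVAL
return value is an assumption; in the driver the BAR length would come from
pci_resource_len() for BAR2 and the memory length from nvdev->memlength.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical probe-time check: the physical BAR2 must be large enough to
 * cover the coherent memory region. Returns 0 if it is, -EINVAL (assumed
 * error code) otherwise.
 */
static int check_bar2_covers_mem(uint64_t bar2_len, uint64_t memlength)
{
	if (bar2_len < memlength)
		return -EINVAL;
	return 0;
}
```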