RE: [PATCH v3 1/1] vfio/nvgrace-gpu: Add Blackwell-Next GPU readiness check via CXL DVSEC

From: Tian, Kevin

Date: Fri Apr 17 2026 - 03:16:36 EST


> From: Ankit Agrawal <ankita@xxxxxxxxxx>
> Sent: Thursday, April 16, 2026 9:45 AM
>
> +
> +static int nvgrace_gpu_wait_device_ready_cxl(struct
> nvgrace_gpu_pci_core_device *nvdev)
> +{
> + struct pci_dev *pdev = nvdev->core_device.pdev;
> + int cxl_dvsec = nvdev->cxl_dvsec;
> + unsigned long mem_info_valid_deadline;
> + unsigned long timeout;
> + u32 dvsec_memory_status;
> + u8 mem_active_timeout;
> +
> + pci_read_config_dword(pdev, cxl_dvsec +
> PCI_DVSEC_CXL_RANGE_SIZE_LOW(0),
> + &dvsec_memory_status);
> +
> + if (cxl_dvsec_mem_is_active(dvsec_memory_status))
> + return 0;
> +
> + mem_active_timeout =
> FIELD_GET(PCI_DVSEC_CXL_MEM_ACTIVE_TIMEOUT,
> + dvsec_memory_status);

Sashiko pointed out that "the Memory_Active_Timeout field is
only valid when the Memory_Info_Valid bit is set". If that's true,
then reading the timeout unconditionally here is incorrect.

https://sashiko.dev/#/patchset/20260416014504.63067-1-ankita%40nvidia.com

> @@ -1146,11 +1218,16 @@ static bool
> nvgrace_gpu_has_mig_hw_bug(struct pci_dev *pdev)
> * Ensure that the BAR0 region is enabled before accessing the
> * registers.
> */
> -static int nvgrace_gpu_probe_check_device_ready(struct pci_dev *pdev)
> +static int nvgrace_gpu_probe_check_device_ready(struct
> nvgrace_gpu_pci_core_device *nvdev)

The comment above this function should be updated too.