Re: [PATCH v6 6/6] vfio/nvgrace-gpu: wait for the GPU mem to be ready

From: Zhi Wang
Date: Tue Nov 25 2025 - 15:30:03 EST

Next message: David Heidelberg via B4 Relay: "[PATCH v4 1/8] dt-bindings: arm: qcom: Add Pixel 3 and 3 XL"
Previous message: syzbot: "Re: [syzbot] [kvm-x86?] WARNING in kvm_apic_accept_events (2)"
In reply to: ankita: "[PATCH v6 6/6] vfio/nvgrace-gpu: wait for the GPU mem to be ready"
Next in thread: ankita: "[PATCH v6 5/6] vfio/nvgrace-gpu: Inform devmem unmapped after reset"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, 25 Nov 2025 17:30:13 +0000
<ankita@xxxxxxxxxx> wrote:

> From: Ankit Agrawal <ankita@xxxxxxxxxx>
>
> Speculative prefetches from CPU to GPU memory until the GPU is
> ready after reset can cause harmless corrected RAS events to
> be logged on Grace systems. It is thus preferred that the
> mapping not be re-established until the GPU is ready post reset.
>
> The GPU readiness can be checked through BAR0 registers similar
> to the checking at the time of device probe.
>
> It can take several seconds for the GPU to be ready. So it is
> desirable that the time overlaps as much of the VM startup as
> possible to reduce impact on the VM bootup time. The GPU
> readiness state is thus checked on the first fault/huge_fault
> request or read/write access which amortizes the GPU readiness
> time.
>

snip

> @@ -179,8 +215,12 @@ static vm_fault_t nvgrace_gpu_vfio_pci_huge_fault(struct vm_fault *vmf,
> pfn & ((1 << order) - 1)))
> return VM_FAULT_FALLBACK;
>
> - scoped_guard(rwsem_read, &vdev->memory_lock)
> + scoped_guard(rwsem_read, &vdev->memory_lock) {
> + if (nvgrace_gpu_check_device_ready(nvdev))
> + return ret;
> +

I would suggest open-coding the error code if we don't have a "bail
out without touching ret" path similar to vfio_pci_mmap_huge_fault();
returning ret here looks unnecessarily confusing.

Please also fix the same in PATCH 2.
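
For illustration only, a minimal userspace sketch of the two shapes being
compared. Everything in it is a stand-in rather than the driver code:
struct fake_dev, check_device_ready() and insert_pfn() are hypothetical,
the VM_FAULT_* values are redefined locally, the initial value of ret is
an assumption, and which error code to open-code is up to the patch
author (VM_FAULT_SIGBUS is used here only as an example).

```c
/* Userspace stand-ins; the real definitions live in the kernel headers. */
typedef unsigned int vm_fault_t;
#define VM_FAULT_SIGBUS 0x0002u
#define VM_FAULT_NOPAGE 0x0100u

/* Hypothetical device state: 0 when ready, negative errno otherwise. */
struct fake_dev {
	int ready_err;
};

/* Hypothetical stand-in for nvgrace_gpu_check_device_ready(). */
static int check_device_ready(const struct fake_dev *d)
{
	return d->ready_err;
}

/* Hypothetical stand-in for vfio_pci_vmf_insert_pfn(). */
static vm_fault_t insert_pfn(const struct fake_dev *d)
{
	(void)d;
	return VM_FAULT_NOPAGE;
}

/*
 * Shape as posted: the early exit returns ret, so the reader has to
 * look up the initializer to learn which fault code is reported.
 */
static vm_fault_t huge_fault_via_ret(const struct fake_dev *d)
{
	vm_fault_t ret = VM_FAULT_SIGBUS; /* assumed initial value */

	if (check_device_ready(d))
		return ret;

	ret = insert_pfn(d);
	return ret;
}

/* Suggested shape: the early exit names the fault code directly. */
static vm_fault_t huge_fault_open_coded(const struct fake_dev *d)
{
	if (check_device_ready(d))
		return VM_FAULT_SIGBUS;

	return insert_pfn(d);
}
```

Both variants return the same values; the second only makes the error
path readable without scrolling back to the declaration of ret.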

> ret = vfio_pci_vmf_insert_pfn(vdev, vmf, pfn, order);
> + }
>
> dev_dbg_ratelimited(&vdev->pdev->dev,
> "%s order = %d pfn 0x%lx: 0x%x\n",
> @@ -592,9 +632,15 @@ nvgrace_gpu_read_mem(struct
