RE: [PATCH v7 5/6] hisi_acc_vfio_pci: bugfix live migration function without VF device driver

From: Shameerali Kolothum Thodi
Date: Tue Apr 15 2025 - 04:46:21 EST

> -----Original Message-----
> From: liulongfang <liulongfang@xxxxxxxxxx>
> Sent: Friday, April 11, 2025 4:59 AM
> To: alex.williamson@xxxxxxxxxx; jgg@xxxxxxxxxx; Shameerali Kolothum
> Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>; Jonathan Cameron
> <jonathan.cameron@xxxxxxxxxx>
> Cc: kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> linuxarm@xxxxxxxxxxxxx; liulongfang <liulongfang@xxxxxxxxxx>
> Subject: [PATCH v7 5/6] hisi_acc_vfio_pci: bugfix live migration function
> without VF device driver
>
> If the VF device driver is not loaded in the Guest OS and we attempt
> device data migration, the address of the migrated data will be NULL.
> The live migration recovery operation on the destination side will
> then access a NULL address, which causes access errors.
>
> Therefore, live migration of a VM whose Guest has not loaded the VF
> device driver does not require device data migration.
> In addition, when the queue address data received by the destination
> is empty, device queue recovery is not performed.
>
> Fixes: b0eed085903e ("hisi_acc_vfio_pci: Add support for VFIO live migration")
> Signed-off-by: Longfang Liu <liulongfang@xxxxxxxxxx>
> Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@xxxxxxxxxx>
> ---
> .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 22 +++++++++++++------
> 1 file changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> index cadc82419dca..d12a350440d3 100644
> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> @@ -426,13 +426,6 @@ static int vf_qm_check_match(struct hisi_acc_vf_core_device *hisi_acc_vdev,
> return -EINVAL;
> }
>
> - ret = qm_write_regs(vf_qm, QM_VF_STATE, &vf_data->vf_qm_state, 1);
> - if (ret) {
> - dev_err(dev, "failed to write QM_VF_STATE\n");
> - return ret;
> - }
> -
> - hisi_acc_vdev->vf_qm_state = vf_data->vf_qm_state;
> hisi_acc_vdev->match_done = true;
> return 0;
> }
> @@ -498,6 +491,20 @@ static int vf_qm_load_data(struct hisi_acc_vf_core_device *hisi_acc_vdev,
> if (migf->total_length < sizeof(struct acc_vf_data))
> return -EINVAL;
>
> + if (!vf_data->eqe_dma || !vf_data->aeqe_dma ||
> + !vf_data->sqc_dma || !vf_data->cqc_dma) {
> + dev_info(dev, "resume dma addr is NULL!\n");
> + hisi_acc_vdev->vf_qm_state = QM_NOT_READY;
> + return 0;
> + }


I still don't fully understand why we need the above check. The only case
I can think of where this will happen is when the source VM's Guest has not
loaded the ACC driver. And we already take care of that case using
vf_qm_state and by checking total_length == QM_MATCH_SIZE.
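
Just to make sure we are looking at the same flow, here is a small
stand-alone sketch of the resume-side decision as I understand it. This is a
hypothetical, condensed model, not the driver code: the struct name, helper
name and the placeholder QM_MATCH_SIZE value are made up for illustration;
only the total_length / dma-address fields mirror what the patch touches.

    #include <stdbool.h>
    #include <stddef.h>

    /* Condensed, hypothetical model of the resume path, not the driver code. */
    struct resume_buf {
            size_t total_length;                    /* bytes received from the source */
            unsigned long long eqe_dma, aeqe_dma;   /* queue addresses from vf_data */
            unsigned long long sqc_dma, cqc_dma;
    };

    #define QM_MATCH_SIZE   32U     /* placeholder, stands in for the real match size */

    static bool need_device_restore(const struct resume_buf *b)
    {
            /* Guest never loaded the VF driver: only match data was sent. */
            if (b->total_length == QM_MATCH_SIZE)
                    return false;

            /* Check added by this patch: full buffer but empty queue addresses. */
            if (!b->eqe_dma || !b->aeqe_dma || !b->sqc_dma || !b->cqc_dma)
                    return false;

            return true;
    }

If that reading is right, the new dma check would only ever fire if a
full-size buffer somehow arrives with zeroed queue addresses.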

Have you seen this happening in any other scenario during your tests?

Thanks,
Shameer

> +
> + ret = qm_write_regs(qm, QM_VF_STATE, &vf_data->vf_qm_state, 1);
> + if (ret) {
> + dev_err(dev, "failed to write QM_VF_STATE\n");
> + return -EINVAL;
> + }
> + hisi_acc_vdev->vf_qm_state = vf_data->vf_qm_state;
> +
> qm->eqe_dma = vf_data->eqe_dma;
> qm->aeqe_dma = vf_data->aeqe_dma;
> qm->sqc_dma = vf_data->sqc_dma;
> @@ -1531,6 +1538,7 @@ static int hisi_acc_vfio_pci_migrn_init_dev(struct vfio_device *core_vdev)
> hisi_acc_vdev->vf_id = pci_iov_vf_id(pdev) + 1;
> hisi_acc_vdev->pf_qm = pf_qm;
> hisi_acc_vdev->vf_dev = pdev;
> + hisi_acc_vdev->vf_qm_state = QM_NOT_READY;
> mutex_init(&hisi_acc_vdev->state_mutex);
> mutex_init(&hisi_acc_vdev->open_mutex);
>
> --
> 2.24.0