Re: [PATCH RFC 1/4] vdpa: introduce .reset_map operation callback

From: Si-Wei Liu
Date: Mon Aug 21 2023 - 18:31:47 EST




On 8/17/2023 8:28 AM, Eugenio Perez Martin wrote:
On Thu, Aug 17, 2023 at 2:05 AM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:


On 8/15/2023 6:55 PM, Jason Wang wrote:
On Wed, Aug 16, 2023 at 3:49 AM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:

On 8/14/2023 7:21 PM, Jason Wang wrote:
On Tue, Aug 15, 2023 at 9:46 AM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
Signed-off-by: Si-Wei Liu <si-wei.liu@xxxxxxxxxx>
---
include/linux/vdpa.h | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index db1b0ea..3a3878d 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -314,6 +314,12 @@ struct vdpa_map_file {
* @iova: iova to be unmapped
* @size: size of the area
* Returns integer: success (0) or error (< 0)
+ * @reset_map: Reset device memory mapping (optional)
+ * Needed for device that using device
+ * specific DMA translation (on-chip IOMMU)
This exposes the device internal to the upper layer which is not optimal.
Not sure what does it mean by "device internal", but this op callback
just follows existing convention to describe what vdpa parent this API
targets.
I meant the bus tries to hide the differences among vendors. So it
needs to hide on-chip IOMMU stuff to the upper layer.

We can expose two dimensional IO mappings models but it looks like
over engineering for this issue. More below.

* @set_map: Set device memory mapping (optional)
* Needed for device that using device
* specific DMA translation (on-chip IOMMU)
:
:
* @dma_map: Map an area of PA to IOVA (optional)
* Needed for device that using device
* specific DMA translation (on-chip IOMMU)
* and preferring incremental map.
:
:
* @dma_unmap: Unmap an area of IOVA (optional but
* must be implemented with dma_map)
* Needed for device that using device
* specific DMA translation (on-chip IOMMU)
* and preferring incremental unmap.


Btw, what's the difference between this and a simple

set_map(NULL)?
I don't think parent drivers support this today - they can accept
non-NULL iotlb containing empty map entry, but not a NULL iotlb. The
behavior is undefined or it even causes panic when a NULL iotlb is
passed in.
We can do this simple change if it can work.
If we go with setting up 1:1 DMA mapping at virtio-vdpa .probe() and
tearing it down at .release(), perhaps set_map(NULL) is not sufficient.
Further this doesn't work with .dma_map parent drivers.
Probably, but I'd remove dma_map as it doesn't have any real users
except for the simulator.
OK, at a point there was suggestion to get this incremental API extended
to support batching to be in par with or even replace .set_map, not sure
if it's too soon to conclude. But I'm okay with the removal if need be.
Yes, I think the right move in the long run is to delegate the
batching to the parent driver. This allows drivers like mlx to add
memory (like hotplugged memory) without the need of tearing down all
the old maps.
Nods.


Having said that, maybe we can work on top if we need to remove
.dma_map for now.
I guess for that sake I would keep .dma_map unless there's strong objection against it.

Thanks,
-Siwei


The reason why a new op is needed or better is because it allows
userspace to tell apart different reset behavior from the older kernel
(via the F_IOTLB_PERSIST feature bit in patch 4), while this behavior
could vary between parent drivers.
I'm ok with a new feature flag, but we need to first seek a way to
reuse the existing API.
A feature flag is needed anyway. I'm fine with reusing but guess I'd
want to converge on the direction first.

Thanks,
-Siwei
Thanks

Regards,
-Siwei

Thanks

+ * @vdev: vdpa device
+ * @asid: address space identifier
+ * Returns integer: success (0) or error (< 0)
* @get_vq_dma_dev: Get the dma device for a specific
* virtqueue (optional)
* @vdev: vdpa device
@@ -390,6 +396,7 @@ struct vdpa_config_ops {
u64 iova, u64 size, u64 pa, u32 perm, void *opaque);
int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
u64 iova, u64 size);
+ int (*reset_map)(struct vdpa_device *vdev, unsigned int asid);
int (*set_group_asid)(struct vdpa_device *vdev, unsigned int group,
unsigned int asid);
struct device *(*get_vq_dma_dev)(struct vdpa_device *vdev, u16 idx);
--
1.8.3.1