Re: [PATCH v4 0/7] vdpa: decouple reset of iotlb mapping from device reset

From: Si-Wei Liu
Date: Wed Oct 25 2023 - 19:32:17 EST


Hi Yang Lei,

Thanks for testing my patches and reporting! As for the issue, could you please try what I posted in:

https://lore.kernel.org/virtualization/1698275594-19204-1-git-send-email-si-wei.liu@xxxxxxxxxx/

and let me know how it goes? Thank you very much!

Thanks,
-Siwei

On 10/25/2023 2:41 AM, Lei Yang wrote:
On Wed, Oct 25, 2023 at 1:27 AM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
Hello Si-Wei
Thanks a lot for testing! Please be aware that there's a follow-up fix
for a potential oops in this v4 series:

The first, when I did not apply this patch [1], I will also hit this
patch mentioned problem. After I applied this patch, this problem will
no longer to hit again. But I hit another issues, about the error
messages please review the attached file.
[1] https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu@xxxxxxxxxx/

My test steps:
git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux/
b4 am 1697880319-4937-1-git-send-email-si-wei.liu@xxxxxxxxxx
b4 am 20231018171456.1624030-2-dtatulea@xxxxxxxxxx
b4 am 1698102863-21122-1-git-send-email-si-wei.liu@xxxxxxxxxx
git am ./v4_20231018_dtatulea_vdpa_add_support_for_vq_descriptor_mappings.mbx
git am ./v4_20231021_si_wei_liu_vdpa_decouple_reset_of_iotlb_mapping_from_device_reset.mbx
git am ./20231023_si_wei_liu_vhost_vdpa_fix_null_pointer_deref_in__compat_vdpa_reset.mbx
cp /boot/config-5.14.0-377.el9.x86_64 .config
make -j 32
make modules_install
make install

Thanks

Lei
https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu@xxxxxxxxxx/

Would be nice to have it applied for any tests.

Thanks,
-Siwei

On 10/23/2023 11:51 PM, Lei Yang wrote:
QE tested this series v4 with regression testing on real nic, there is
no new regression bug.

Tested-by: Lei Yang <leiyang@xxxxxxxxxx>

On Tue, Oct 24, 2023 at 6:02 AM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:

On 10/22/2023 8:51 PM, Jason Wang wrote:
Hi Si-Wei:

On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
In order to reduce needlessly high setup and teardown cost
of iotlb mapping during live migration, it's crucial to
decouple the vhost-vdpa iotlb abstraction from the virtio
device life cycle, i.e. iotlb mappings should be left
intact across virtio device reset [1]. For it to work, the
on-chip IOMMU parent device could implement a separate
.reset_map() operation callback to restore 1:1 DMA mapping
without having to resort to the .reset() callback, the
latter of which is mainly used to reset virtio device state.
This new .reset_map() callback will be invoked only before
the vhost-vdpa driver is to be removed and detached from
the vdpa bus, such that other vdpa bus drivers, e.g.
virtio-vdpa, can start with 1:1 DMA mapping when they
are attached. For the context, those on-chip IOMMU parent
devices, create the 1:1 DMA mapping at vdpa device creation,
and they would implicitly destroy the 1:1 mapping when
the first .set_map or .dma_map callback is invoked.

This patchset is rebased on top of the latest vhost tree.

[1] Reducing vdpa migration downtime because of memory pin / maps
https://www.mail-archive.com/qemu-devel@xxxxxxxxxx/msg953755.html

---
v4:
- Rework compatibility using new .compat_reset driver op
I still think having a set_backend_feature()
This will overload backend features with the role of carrying over
compatibility quirks, which I tried to avoid from. While I think the
.compat_reset from the v4 code just works with the backend features
acknowledgement (and maybe others as well) to determine, but not
directly tie it to backend features itself. These two have different
implications in terms of requirement, scope and maintaining/deprecation,
better to cope with compat quirks in explicit and driver visible way.

or reset_map(clean=true) might be better.
An explicit op might be marginally better in driver writer's point of
view. Compliant driver doesn't have to bother asserting clean_map never
be true so their code would never bother dealing with this case, as
explained in the commit log for patch 5 "vhost-vdpa: clean iotlb map
during reset for older userspace":

"
The separation of .compat_reset from the regular .reset allows
vhost-vdpa able to know which driver had broken behavior before, so it
can apply the corresponding compatibility quirk to the individual
driver
whenever needed. Compared to overloading the existing .reset with
flags, .compat_reset won't cause any extra burden to the implementation
of every compliant driver.
"

As it tries hard to not introduce new stuff on the bus.
Honestly I don't see substantial difference between these other than the
color. There's no single best solution that stands out among the 3. And
I assume you already noticed it from all the above 3 approaches will
have to go with backend features negotiation, that the 1st vdpa reset
before backend feature negotiation will use the compliant version of
.reset that doesn't clean up the map. While I don't think this nuance
matters much to existing older userspace apps, as the maps should
already get cleaned by previous process in vhost_vdpa_cleanup(), but if
bug-for-bug behavioral compatibility is what you want, module parameter
will be the single best answer.

Regards,
-Siwei

But we can listen to others for sure.

Thanks