RE: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)

From: Shameerali Kolothum Thodi
Date: Tue Nov 12 2019 - 06:08:22 EST


Hi Eric,

> -----Original Message-----
> From: kvmarm-bounces@xxxxxxxxxxxxxxxxxxxxx
> [mailto:kvmarm-bounces@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of Eric Auger
> Sent: 11 July 2019 14:56
> To: eric.auger.pro@xxxxxxxxx; eric.auger@xxxxxxxxxx;
> iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> kvm@xxxxxxxxxxxxxxx; kvmarm@xxxxxxxxxxxxxxxxxxxxx; joro@xxxxxxxxxx;
> alex.williamson@xxxxxxxxxx; jacob.jun.pan@xxxxxxxxxxxxxxx;
> yi.l.liu@xxxxxxxxx; jean-philippe.brucker@xxxxxxx; will.deacon@xxxxxxx;
> robin.murphy@xxxxxxx
> Cc: kevin.tian@xxxxxxxxx; vincent.stehle@xxxxxxx; ashok.raj@xxxxxxxxx;
> marc.zyngier@xxxxxxx; tina.zhang@xxxxxxxxx
> Subject: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)
>
> This series brings the VFIO part of HW nested paging support
> in the SMMUv3.
>
> The series depends on:
> [PATCH v9 00/14] SMMUv3 Nested Stage Setup (IOMMU part)
> (https://www.spinics.net/lists/kernel/msg3187714.html)
>
> 3 new IOCTLs are introduced that allow the userspace to
> 1) pass the guest stage 1 configuration
> 2) pass stage 1 MSI bindings
> 3) invalidate stage 1 related caches
>
> They map onto the related new IOMMU API functions.
>
> We introduce the capability to register specific interrupt
> indexes (see [1]). A new DMA_FAULT interrupt index allows to register
> an eventfd to be signaled whenever a stage 1 related fault
> is detected at physical level. Also a specific region allows
> to expose the fault records to the user space.

I am trying to get this running on one of our platforms that has SMMUv3 dual
stage support. I am seeing some issues when an ixgbe VF device is passed
through and sits behind a vSMMUv3 in the Guest.

Kernel used : https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9
Qemu: https://github.com/eauger/qemu/tree/v4.1.0-rc0-2stage-rfcv5

And this is my Qemu cmd line,

./qemu-system-aarch64 \
-machine virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 -cpu host \
-kernel Image \
-drive if=none,file=ubuntu,id=fs \
-device virtio-blk-device,drive=fs \
-device vfio-pci,host=0000:01:10.1 \
-bios QEMU_EFI.fd \
-net none \
-m 4G \
-nographic -D -d -enable-kvm \
-append "console=ttyAMA0 root=/dev/vda rw acpi=force"

The basic ping from Guest works fine,
root@ubuntu:~# ping 10.202.225.185
PING 10.202.225.185 (10.202.225.185) 56(84) bytes of data.
64 bytes from 10.202.225.185: icmp_seq=2 ttl=64 time=0.207 ms
64 bytes from 10.202.225.185: icmp_seq=3 ttl=64 time=0.203 ms
...

But if I increase ping packet size,

root@ubuntu:~# ping -s 1024 10.202.225.185
PING 10.202.225.185 (10.202.225.185) 1024(1052) bytes of data.
1032 bytes from 10.202.225.185: icmp_seq=22 ttl=64 time=0.292 ms
1032 bytes from 10.202.225.185: icmp_seq=23 ttl=64 time=0.207 ms
From 10.202.225.169 icmp_seq=66 Destination Host Unreachable
From 10.202.225.169 icmp_seq=67 Destination Host Unreachable
From 10.202.225.169 icmp_seq=68 Destination Host Unreachable
From 10.202.225.169 icmp_seq=69 Destination Host Unreachable
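
For anyone comparing runs, a minimal sketch (not part of the series or of the original report) that splits captured ping output into successful replies and ICMP error lines, so the point where replies stop is easy to spot. The helper name summarize_ping and both regular expressions are assumptions made for illustration; they only target the iputils-style lines shown above.

```python
import re

# Successful echo reply, e.g.
# "1032 bytes from 10.202.225.185: icmp_seq=22 ttl=64 time=0.292 ms"
REPLY_RE = re.compile(r"^\d+ bytes from \S+ icmp_seq=(\d+) ")

# ICMP error line, e.g.
# "From 10.202.225.169 icmp_seq=66 Destination Host Unreachable"
# The optional ">" tolerates mbox-style "From " escaping in archived mail.
ERROR_RE = re.compile(r"^>?From (\S+) icmp_seq=(\d+) (.+)$")


def summarize_ping(text):
    """Return (replies, errors) parsed from captured ping output.

    replies: list of icmp_seq numbers that were answered.
    errors:  list of (icmp_seq, reporting_address, message) tuples.
    Lines that match neither pattern (banner, statistics) are ignored.
    """
    replies, errors = [], []
    for line in text.splitlines():
        line = line.strip()
        m = REPLY_RE.match(line)
        if m:
            replies.append(int(m.group(1)))
            continue
        m = ERROR_RE.match(line)
        if m:
            errors.append((int(m.group(2)), m.group(1), m.group(3)))
    return replies, errors
```

Fed the output above, this reports the last answered sequence number and the first one that came back as unreachable, which can then be lined up against the host-side timestamps.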

And from Host kernel I get,
[ 819.970742] ixgbe 0000:01:00.1 enp1s0f1: 3 Spoofed packets detected
[ 824.002707] ixgbe 0000:01:00.1 enp1s0f1: 1 Spoofed packets detected
[ 828.034683] ixgbe 0000:01:00.1 enp1s0f1: 1 Spoofed packets detected
[ 830.050673] ixgbe 0000:01:00.1 enp1s0f1: 4 Spoofed packets detected
[ 832.066659] ixgbe 0000:01:00.1 enp1s0f1: 1 Spoofed packets detected
[ 834.082640] ixgbe 0000:01:00.1 enp1s0f1: 3 Spoofed packets detected
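
To quantify the host-side messages across test runs, a minimal sketch that totals the "Spoofed packets detected" counts per network interface from saved dmesg text. The function name tally_spoofed and the regular expression are assumptions for illustration, written only against the line format shown above; other ixgbe versions may log differently.

```python
import re
from collections import Counter

# Matches lines such as:
# "[ 819.970742] ixgbe 0000:01:00.1 enp1s0f1: 3 Spoofed packets detected"
# Groups: timestamp, PCI address, netdev name, packet count.
SPOOF_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+ixgbe\s+(\S+)\s+(\S+):\s+(\d+) Spoofed packets detected"
)


def tally_spoofed(log_text):
    """Sum the reported spoofed-packet counts per netdev name."""
    totals = Counter()
    for match in SPOOF_RE.finditer(log_text):
        _timestamp, _pci_addr, netdev, count = match.groups()
        totals[netdev] += int(count)
    return dict(totals)
```

For the six lines above this gives a total of 13 for enp1s0f1.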

I also noted that iperf does not work: it fails to establish a connection with
the iperf server.

Please find attached the trace logs (vfio*, smmuv3*) from Qemu for your reference.
I haven't debugged this further yet and wanted to check with you whether this is
something you have already seen. Or maybe I am missing something here?

Please let me know.

Thanks,
Shameer

> Best Regards
>
> Eric
>
> This series can be found at:
> https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9
>
> This series includes Tina's patch stemming from
> [1] "[RFC PATCH v2 1/3] vfio: Use capability chains to handle device
> specific irq" plus patches originally contributed by Yi.
>
> History:
>
> v8 -> v9:
> - introduce specific irq framework
> - single fault region
> - iommu_unregister_device_fault_handler failure case not handled
> yet.
>
> v7 -> v8:
> - rebase on top of v5.2-rc1 and especially
> 8be39a1a04c1 iommu/arm-smmu-v3: Add a master->domain pointer
> - dynamic alloc of s1_cfg/s2_cfg
> - __arm_smmu_tlb_inv_asid/s1_range_nosync
> - check there is no HW MSI regions
> - asid invalidation using pasid extended struct (change in the uapi)
> - add s1_live/s2_live checks
> - move check about support of nested stages in domain finalise
> - fixes in error reporting according to the discussion with Robin
> - reordered the patches to have first iommu/smmuv3 patches and then
> VFIO patches
>
> v6 -> v7:
> - removed device handle from bind/unbind_guest_msi
> - added "iommu/smmuv3: Nested mode single MSI doorbell per domain
> enforcement"
> - added a few uapi comments as suggested by Jean, Jacob and Alex
>
> v5 -> v6:
> - Fix compilation issue when CONFIG_IOMMU_API is unset
>
> v4 -> v5:
> - fix bug reported by Vincent: fault handler unregistration now happens in
> vfio_pci_release
> - IOMMU_FAULT_PERM_* moved outside of struct definition + small
> uapi changes suggested by Jean-Philippe (except fetch_addr)
> - iommu: introduce device fault report API: removed the PRI part.
> - see individual logs for more details
> - reset the ste abort flag on detach
>
> v3 -> v4:
> - took into account Alex, Jean-Philippe and Robin's comments on v3
> - rework of the smmuv3 driver integration
> - add tear down ops for msi binding and PASID table binding
> - fix S1 fault propagation
> - put fault reporting patches at the beginning of the series following
> Jean-Philippe's request
> - update of the cache invalidate and fault API uapis
> - VFIO fault reporting rework with 2 separate regions and one mmappable
> segment for the fault queue
> - moved to PATCH
>
> v2 -> v3:
> - When registering the S1 MSI binding we now store the device handle. This
> addresses Robin's comment about discrimination of devices belonging to
> different S1 groups and using different physical MSI doorbells.
> - Change the fault reporting API: use VFIO_PCI_DMA_FAULT_IRQ_INDEX to
> set the eventfd and expose the faults through an mmappable fault region
>
> v1 -> v2:
> - Added the fault reporting capability
> - asid properly passed on invalidation (fix assignment of multiple
> devices)
> - see individual change logs for more info
>
>
> Eric Auger (8):
> vfio: VFIO_IOMMU_SET_MSI_BINDING
> vfio/pci: Add VFIO_REGION_TYPE_NESTED region type
> vfio/pci: Register an iommu fault handler
> vfio/pci: Allow to mmap the fault queue
> vfio: Add new IRQ for DMA fault reporting
> vfio/pci: Add framework for custom interrupt indices
> vfio/pci: Register and allow DMA FAULT IRQ signaling
> vfio: Document nested stage control
>
> Liu, Yi L (2):
> vfio: VFIO_IOMMU_SET_PASID_TABLE
> vfio: VFIO_IOMMU_CACHE_INVALIDATE
>
> Tina Zhang (1):
> vfio: Use capability chains to handle device specific irq
>
> Documentation/vfio.txt | 77 ++++++++
> drivers/vfio/pci/vfio_pci.c | 283 ++++++++++++++++++++++++++--
> drivers/vfio/pci/vfio_pci_intrs.c | 62 ++++++
> drivers/vfio/pci/vfio_pci_private.h | 24 +++
> drivers/vfio/pci/vfio_pci_rdwr.c | 45 +++++
> drivers/vfio/vfio_iommu_type1.c | 166 ++++++++++++++++
> include/uapi/linux/vfio.h | 109 ++++++++++-
> 7 files changed, 747 insertions(+), 19 deletions(-)
>
> --
> 2.20.1
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@xxxxxxxxxxxxxxxxxxxxx
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

Attachment: trace-20045-vfio-smmu.log