Re: [PATCH] iommu/amd: page-specific invalidations for more than one page

From: Nadav Amit
Date: Wed Apr 07 2021 - 13:57:36 EST

> On Apr 7, 2021, at 3:01 AM, Joerg Roedel <joro@xxxxxxxxxx> wrote:
>
> On Tue, Mar 23, 2021 at 02:06:19PM -0700, Nadav Amit wrote:
>> From: Nadav Amit <namit@xxxxxxxxxx>
>>
>> Currently, IOMMU invalidations and device-IOTLB invalidations using
>> AMD IOMMU fall back to full address-space invalidation if more than a
>> single page needs to be flushed.
>>
>> Full flushes are especially inefficient when the IOMMU is virtualized by
>> a hypervisor, since each one requires the hypervisor to synchronize the
>> entire address space.
>>
>> AMD IOMMUs allow providing a mask to perform page-specific
>> invalidations for multiple pages that match the address. The mask is
>> encoded as part of the address, and the first zero bit in the address
>> (in bits [51:12]) indicates the mask size.
>>
>> Use this hardware feature to perform selective IOMMU and IOTLB flushes.
>> Combine the logic between both for better code reuse.
>>
>> The IOMMU invalidations passed a smoke-test. The device IOTLB
>> invalidations are untested.
>
> Have you thoroughly tested this on real hardware? I had a patch-set
> doing the same many years ago and it lead to data corruption under load.
> Back then it could have been a bug in my code of course, but it made me
> cautious about using targeted invalidations.

I tested it on real bare-metal hardware. I ran some basic I/O workloads
with the IOMMU enabled, checkers enabled/disabled, and so on.

However, I only tested the IOMMU flushes and did not test that the
device-IOTLB flushes work, since I did not have the hardware for that.

If you can refer me to the old patches, I will take a look and see
whether I can spot a difference in the logic, or test them. If you want
me to run different tests - let me know. If you want me to remove
the device-IOTLB invalidation logic - that is also fine with me.