Re: [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

From: Jean-Philippe Brucker
Date: Thu May 16 2019 - 08:45:42 EST


On 16/05/2019 12:45, Auger Eric wrote:
> Hi Jean-Philippe,
>
> On 5/16/19 1:16 PM, Jean-Philippe Brucker wrote:
>> On 16/05/2019 11:08, Eric Auger wrote:
>>> Note: At the moment the sysfs ABI is not changed. However I wonder
>>> whether it wouldn't be preferable to report the direct region as
>>> "direct_relaxed" there. At the moment, in case the same direct
>>> region is used by 2 devices, one USB/GFX and another not belonging
>>> to the previous categories, the direct region will be output twice
>>> with "direct" type.
>>>
>>> This would unblock Shameer's series:
>>> [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
>>> https://patchwork.kernel.org/patch/10425309/
>>
>> Thanks for doing this!
>>
>>> which failed to get pulled for 4.18 merge window due to IGD
>>> device assignment regression.
>>>
>>> v2 -> v3:
>>> - fix direct type check
>>> ---
>>> drivers/iommu/iommu.c | 12 +++++++-----
>>> include/linux/iommu.h | 6 ++++++
>>> 2 files changed, 13 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>> index ae4ea5c0e6f9..28c3d6351832 100644
>>> --- a/drivers/iommu/iommu.c
>>> +++ b/drivers/iommu/iommu.c
>>> @@ -73,10 +73,11 @@ struct iommu_group_attribute {
>>> };
>>>
>>> static const char * const iommu_group_resv_type_string[] = {
>>> - [IOMMU_RESV_DIRECT] = "direct",
>>> - [IOMMU_RESV_RESERVED] = "reserved",
>>> - [IOMMU_RESV_MSI] = "msi",
>>> - [IOMMU_RESV_SW_MSI] = "msi",
>>> + [IOMMU_RESV_DIRECT] = "direct",
>>> + [IOMMU_RESV_DIRECT_RELAXABLE] = "direct",
>>> + [IOMMU_RESV_RESERVED] = "reserved",
>>> + [IOMMU_RESV_MSI] = "msi",
>>> + [IOMMU_RESV_SW_MSI] = "msi",
>>> };
>>>
>>> #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store) \
>>> @@ -573,7 +574,8 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
>>> start = ALIGN(entry->start, pg_size);
>>> end = ALIGN(entry->start + entry->length, pg_size);
>>>
>>> - if (entry->type != IOMMU_RESV_DIRECT)
>>> + if (entry->type != IOMMU_RESV_DIRECT &&
>>> + entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
>>
>> I'm trying to understand why you need to create direct mappings at all
>> for these relaxable regions. In the host the region is needed for legacy
>> device features, which are disabled (and cannot be re-enabled) when
>> assigning the device to a guest?
> This follows Kevin's comment in the thread below:
> https://patchwork.kernel.org/patch/10449103/#21957279
>
> In normal DMA API host path, those regions need to be 1-1 mapped. They
> are likely to be accessed by the driver or FW at early boot phase or
> even during execution, depending on features being used.
>
> That's the reason, according to Kevin we couldn't hide them.
>
> We just know that, in general, they are not used anymore when assigning
> the device or if accesses are attempted this generally does not block
> the assignment use case. For example, it is said in
> https://github.com/qemu/qemu/blob/master/docs/igd-assign.txt that in
> legacy IGD assignment use case, there may be "a small numbers of DMAR
> faults when initially assigned".

Hmm, fair enough. That doesn't sound too good, if the device might
perform arbitrary writes into guest memory once new IOMMU mappings are
in place. I was wondering if we could report some IOVA ranges as
"available but avoid if possible". If the guest has a vIOMMU, they are
easy to avoid. But I doubt they would ever get used, since probably no
one is going to instantiate a vIOMMU for a graphics device in legacy mode.

Thanks,
Jean