Re: [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions
From: Auger Eric
Date: Mon May 20 2019 - 08:47:46 EST
Hi Robin,
On 5/16/19 7:53 PM, Robin Murphy wrote:
> On 16/05/2019 14:23, Auger Eric wrote:
>> Hi Robin,
>> On 5/16/19 2:46 PM, Robin Murphy wrote:
>>> On 16/05/2019 11:08, Eric Auger wrote:
>>>> Introduce a new type for reserved region. This corresponds
>>>> to directly mapped regions which are known to be relaxable
>>>> in some specific conditions, such as device assignment use
>>>> case. Well known examples are those used by USB controllers
>>>> providing PS/2 keyboard emulation for pre-boot BIOS and
>>>> early BOOT or RMRRs associated to IGD working in legacy mode.
>>>>
>>>> Since commit c875d2c1b808 ("iommu/vt-d: Exclude devices using RMRRs
>>>> from IOMMU API domains") and commit 18436afdc11a ("iommu/vt-d: Allow
>>>> RMRR on graphics devices too"), those regions are currently
>>>> considered "safe" with respect to device assignment use case
>>>> which requires a non direct mapping at IOMMU physical level
>>>> (RAM GPA -> HPA mapping).
>>>>
>>>> Those RMRRs currently exist and sometimes the device is
>>>> attempting to access it but this has not been considered
>>>> an issue until now.
>>>>
>>>> However at the moment, iommu_get_group_resv_regions() is
>>>> not able to make any difference between directly mapped
>>>> regions: those which must be absolutely enforced and those
>>>> like above ones which are known as relaxable.
>>>>
>>>> This is a blocker for reporting severe conflicts between
>>>> non relaxable RMRRs (like MSI doorbells) and guest GPA space.
>>>>
>>>> With this new reserved region type we will be able to use
>>>> iommu_get_group_resv_regions() to enumerate the IOVA space
>>>> that is usable through the IOMMU API without introducing
>>>> regressions with respect to existing device assignment
>>>> use cases (USB and IGD).
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger@xxxxxxxxxx>
>>>>
>>>> ---
>>>>
>>>> Note: At the moment the sysfs ABI is not changed. However I wonder
>>>> whether it wouldn't be preferable to report the direct region as
>>>> "direct_relaxed" there. At the moment, in case the same direct
>>>> region is used by 2 devices, one USB/GFX and another not belonging
>>>> to the previous categories, the direct region will be output twice
>>>> with "direct" type.
>>>
>>> Hmm, that sounds a bit off - if we have overlapping regions within the
>>> same domain, then we need to do some additional pre-processing to adjust
>>> them anyway, since any part of a relaxable region which overlaps a
>>> non-relaxable region cannot actually be relaxed, and so really should
>>> never be described as such.
>> In iommu_insert_resv_region(), we are overlapping regions of the same
>> type. So iommu_get_group_resv_regions() should return both the relaxable
>> region and non relaxable one. I should test this again using a hacked
>> kernel though.
>
> We should still consider relaxable regions as being able to merge back
> in to regular direct regions to a degree - If a relaxable region falls
> entirely within a direct one then there's no point exposing it because
> the direct region *has* to take precedence and be enforced. If there is
> an incomplete overlap then we could possibly just trust consumers to see
> it and give the direct region precedence themselves, but since the
> relaxable region is our own in-kernel invention rather than firmware
> gospel I think it would be safer to truncate it to just its
> non-overlapping part.
While I understand your reasoning above, and I tend to agree this would
be nice to have, I have spent some time looking at the algo to
merge/split/overlay regions and I am afraid that this will bring a huge
complexity in the insertion function (I am also afraid the existing data
structs may not be well adapted overall). So I am now a bit reluctant to
add such complexty for shrinking relaxable regions that we will
eventually ignore in any case.
Thanks
Eric
>
> Robin.