Re: [PATCH v4 4/6] vfio/type1: check dma map request is within a valid iova range
From: Auger Eric
Date: Wed Feb 28 2018 - 04:02:26 EST
Hi Shameer,
On 27/02/18 10:57, Shameerali Kolothum Thodi wrote:
>
>
>> -----Original Message-----
>> From: Auger Eric [mailto:eric.auger@xxxxxxxxxx]
>> Sent: Tuesday, February 27, 2018 8:27 AM
>> To: Alex Williamson <alex.williamson@xxxxxxxxxx>
>> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>;
>> pmorel@xxxxxxxxxxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; linux-
>> kernel@xxxxxxxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>; John Garry
>> <john.garry@xxxxxxxxxx>; xuwei (O) <xuwei5@xxxxxxxxxx>; Robin Murphy
>> <robin.murphy@xxxxxxx>
>> Subject: Re: [PATCH v4 4/6] vfio/type1: check dma map request is within a valid
>> iova range
>>
>> Hi,
>> On 27/02/18 00:13, Alex Williamson wrote:
>>> On Mon, 26 Feb 2018 23:05:43 +0100
>>> Auger Eric <eric.auger@xxxxxxxxxx> wrote:
>>>
>>>> Hi Shameer,
>>>>
>>>> [Adding Robin in CC]
>>>> On 21/02/18 13:22, Shameer Kolothum wrote:
>>>>> This checks and rejects any dma map request outside valid iova
>>>>> range.
>>>>>
>>>>> Signed-off-by: Shameer Kolothum
>> <shameerali.kolothum.thodi@xxxxxxxxxx>
>>>>> ---
>>>>> drivers/vfio/vfio_iommu_type1.c | 22 ++++++++++++++++++++++
>>>>> 1 file changed, 22 insertions(+)
>>>>>
>>>>> diff --git a/drivers/vfio/vfio_iommu_type1.c
>> b/drivers/vfio/vfio_iommu_type1.c
>>>>> index a80884e..3049393 100644
>>>>> --- a/drivers/vfio/vfio_iommu_type1.c
>>>>> +++ b/drivers/vfio/vfio_iommu_type1.c
>>>>> @@ -970,6 +970,23 @@ static int vfio_pin_map_dma(struct vfio_iommu
>> *iommu, struct vfio_dma *dma,
>>>>> return ret;
>>>>> }
>>>>>
>>>>> +/*
>>>>> + * Check dma map request is within a valid iova range
>>>>> + */
>>>>> +static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
>>>>> + dma_addr_t start, dma_addr_t end)
>>>>> +{
>>>>> + struct list_head *iova = &iommu->iova_list;
>>>>> + struct vfio_iova *node;
>>>>> +
>>>>> + list_for_each_entry(node, iova, list) {
>>>>> + if ((start >= node->start) && (end <= node->end))
>>>>> + return true;
>>>> I am now confused by the fact this change will prevent existing QEMU
>>>> from working with this series on some platforms. For instance QEMU virt
>>>> machine GPA space collides with Seattle PCI host bridge windows. On ARM
>>>> the smmu and smmuv3 drivers report the PCI host bridge windows as
>>>> reserved regions which does not seem to be the case on other platforms.
>>>> The change happened in commit
>> 273df9635385b2156851c7ee49f40658d7bcb29d
>>>> (iommu/dma: Make PCI window reservation generic).
>>>>
>>>> For background, we already discussed the topic after LPC 2016. See
>>>> https://www.spinics.net/lists/kernel/msg2379607.html.
>>>>
>>>> So is it the right choice to expose PCI host bridge windows as reserved
>>>> regions? If yes shouldn't we make a difference between those and MSI
>>>> windows in this series and do not reject any user space DMA_MAP attempt
>>>> within PCI host bridge windows.
>>>
>>> If the QEMU machine GPA collides with a reserved region today, then
>>> either:
>>>
>>> a) The mapping through the IOMMU works and the reserved region is wrong
>>>
>>> or
>>>
>>> b) The mapping doesn't actually work, QEMU is at risk of data loss by
>>> being told that it worked, and we're justified in changing that
>>> behavior.
>>>
>>> Without knowing the specifics of SMMU, it doesn't particularly make
>>> sense to me to mark the entire PCI hierarchy MMIO range as reserved,
>>> unless perhaps the IOMMU is incapable of translating those IOVAs.
>> to me the limitation does not come from the smmu itself, which is a
>> separate HW block sitting between the root complex and the interconnect.
>> If ACS is not enforced by the PCIe subsystem, the transaction will never
>> reach the IOMMU.
>
> True. And we do have one such platform where ACS is not enforced but
> reserving the regions and possibly creating holes while launching VM will
> make it secure. But I do wonder how we will solve the device grouping
> in such cases.
>
> The Seattle PCI host bridge windows case you mentioned has any pci quirk
> to claim that they support ACS?
No there is none to my knowledge. I am applying Alex' not upstream ACS
overwrite patch.
Thanks
Eric
>
>> In the case of such overlap, shouldn't we just warn the end-user that
>> this situation is dangerous instead of forbidding the use case which
>> worked "in most cases" until now.
>
> Yes, may be something similar to the allow_unsafe_interrupts case, if
> that is acceptable.
>
> Thanks,
> Shameer
>
>>> Are we trying to prevent untranslated p2p with this reserved range?
>>> That's not necessarily a terrible idea, but it seems that doing it for
>>> that purpose would need to be a lot smarter, taking into account ACS
>>> and precisely selecting ranges within the peer address space that would
>>> be untranslated. Perhaps only populated MMIO within non-ACS
>>> hierarchies. Thanks,
>>
>> Indeed taking into account the ACS capability would refine the
>> situations where a risk exists.
>>
>> Thanks
>>
>> Eric
>>>
>>> Alex
>>>