Re: [PATCH v9 0/7] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 3/3: vfio changes
From: Pranav Sawargaonkar
Date: Mon Jun 20 2016 - 11:43:30 EST
Hi Eric,
Tested this series on APM X-Gene2 with an E1000 and a SATA sil card.
Tested-By: Pranavkumar Sawargaonkar <psawargaonkar@xxxxxxx>
Thanks,
Pranav
On Thu, Jun 9, 2016 at 1:25 PM, Auger Eric <eric.auger@xxxxxxxxxx> wrote:
> Alex,
>> On Wed, 8 Jun 2016 10:29:35 +0200
>> Auger Eric <eric.auger@xxxxxxxxxx> wrote:
>>
>>> Dear all,
>>> On 20/05/2016 at 18:01, Eric Auger wrote:
>>>> Alex, Robin,
>>>>
>>>> While my 3 part series primarily addresses the problem of mapping
>>>> MSI doorbells into arm-smmu, it fails at:
>>>>
>>>> 1) determining whether the MSI controller is downstream or upstream to
>>>> the IOMMU,
>>>> => indicates whether the MSI doorbell must be mapped
>>>> => participates in the decision about 2)
>>>>
>>>> 2) determining whether it is safe to assign a PCIe device.
>>>>
>>>> I think we share this understanding with Robin. All above of course
>>>> stands for ARM.
>>>>
>>>> I am stuck on those 2 issues and I have a few questions about iommu
>>>> group setup, PCIe, and iommu dt/ACPI description. I would be grateful
>>>> if you could answer some of those questions and advise on the
>>>> strategy to fix them.
>>>
>>> gentle reminder about the questions below; hope I did not miss any reply.
>>> If anybody has some time to spend on this topic...
>>>
>>>>
>>>> Best Regards
>>>>
>>>> Eric
>>>>
>>>> QUESTIONS:
>>>>
>>>> 1) Robin, you pointed out some host controllers which are also MSI
>>>> controllers
>>>> (http://thread.gmane.org/gmane.linux.kernel.pci/47174/focus=47268). In
>>>> that case MSIs never reach the IOMMU. I could not find anything about
>>>> MSIs in the PCIe ACS spec. What should the iommu groups be in that
>>>> situation? Isn't the upstreamed code able to see that some DMA transfers
>>>> are not properly isolated and alias devices in the same group? According
>>>> to your security warning, Alex, I would think the code does not recognize
>>>> it, can you confirm please?
>>> my current understanding is end points would be in separate groups (assuming
>>> ACS support) although MSI controller frame is not properly protected.
>>
>> We don't currently consider MSI differently from other DMA and we don't
>> currently have any sort of concept of a device within the intermediate
>> fabric as being a DMA target. We expect fabric devices to only be
>> transaction routers. We use ACS to determine whether there's any
>> possibility of DMA being redirected before it reaches the IOMMU, but it
>> seems that a DMA being consumed by an interrupt controller before it
>> reaches the IOMMU would be another cause for an isolation breach.
>>
> OK thank you for the confirmation
>>>> 2) can other PCIe components be MSI controllers?
>>
>> I'm not even entirely sure what this means. Would a DMA write from an
>> endpoint target the MMIO space of an intermediate, fabric device?
> With the example provided by Robin we have a host controller acting as
> an MSI controller. I wondered whether we could have some other fabric
> devices (downstream to the host controller in PCIe terminology) also
> likely to act as MSI controllers.
>>
>>>> 3) Am I obliged to consider arbitrary topologies where an MSI controller
>>>> stands between the PCIe host and the iommu? in the PCIe space or
>>>> platform space? If this only relates to PCIe couldn't I check if an MSI
>>>> controller exists in the PCIe tree?
>>> In my last series, I consider the assignment of platform device unsafe as
>>> soon as there is a GICv2m. This is a change in the user experience compared to
>>> what we had before.
>>
>> If the MSI controller is downstream of our DMA translation, it doesn't
>> seem like we have much choice but to mark it unsafe. The endpoint is
>> fully able to attempt to exploit it.
> OK the original question was related to non PCIe topologies:
>
> - we know some PCIe fabric topologies where the PCIe host controller
> implements an MSI controller.
> - Shall we be prepared to address the same kind of issues with platform
> MSI controllers? Are there some SoCs where an unsafe platform MSI
> controller sits before IOMMU translation? Or do we consider it a
> platform topology we don't support for assignment?
>
>>
>>>> 4) Robin suggested in a private thread to enumerate through a list of
>>>> "registered" doorbells and if any belongs to an unsafe MSI controller,
>>>> consider the assignment is unsafe. This would be a first step before
>>>> doing something more complex. Alex, would that be acceptable to you for
>>>> issue #2?
>>> I implemented this technique in my last series waiting for more discussion
>>> on 4, 5.
>>
>> Seems sufficient. I don't mind taking a broad swing versus all the
>> extra complexity of defining which devices are safe vs unsafe.
> OK
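
For reference, the check discussed in question 4 (walk the registered
doorbells and call the assignment unsafe as soon as one belongs to an
unsafe MSI controller) could be sketched roughly as below. This is only an
illustration: struct msi_doorbell, its fields and msi_doorbells_all_safe()
are hypothetical names, not the ones used in the series, and "safe" is
assumed here to mean that the controller implements irq remapping.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for a registered doorbell (not from the series). */
struct msi_doorbell {
    uint64_t base;          /* physical address of the doorbell frame */
    size_t   size;          /* size of the frame in bytes */
    bool     irq_remapping; /* assumed: true if the controller isolates MSIs */
};

/*
 * Broad check: the assignment is treated as safe only if every registered
 * doorbell belongs to a controller flagged as safe. One unsafe doorbell is
 * enough to mark the whole assignment unsafe.
 */
bool msi_doorbells_all_safe(const struct msi_doorbell *dbs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!dbs[i].irq_remapping)
            return false;
    }
    return true;
}
```

The check is deliberately coarse: it does not try to work out which
doorbell a given device can actually reach, which matches the "broad
swing" described above.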
>>
>>>> 5) About issue #1: don't we miss tools in dt/ACPI to describe the
>>>> location of the iommu on ARM? This is not needed on x86 because
>>>> irq_remapping and IOMMU are at the same place but my understanding is
>>>> that it is on ARM where
>>>> - there is no connection between the MSI controller - which implements
>>>> irq remapping - and the iommu
>>>> - MSI are conveyed on the same address space as standard memory
>>>> transactions.
>>
>> It seems pretty dubious to me to have fixed address, unprotected MSI
>> controllers sitting in the DMA space of a device before IOMMU
>> translation.
> same for me ;-)
>> Seems like you not only need to mark interrupts as
>> unsafe, but exclude the address space of the MSI controller from the
>> available IOVA space to the user.
> I currently do not see how to achieve that. The guest can program the
> assigned device DMA target address with the MSI frame PA. There is no
> IOMMU to protect it. How can we do that if we don't trap on DMA programming?
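
As a rough illustration of the "exclude the address space of the MSI
controller from the available IOVA space" part only, a map request could
be refused when it overlaps a reserved window, along the lines below. The
names (struct addr_range, ranges_overlap(), iova_map_allowed()) are
hypothetical and not taken from vfio or the series. It does not answer
the question above: it only filters mappings requested by the user and
cannot stop a device from targeting the MSI frame PA when no IOMMU sits
in between.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a reserved window, e.g. an MSI frame. */
struct addr_range {
    uint64_t start;
    uint64_t size;
};

/*
 * Inclusive-bounds overlap test. Assumes start + size - 1 does not wrap
 * around the 64-bit address space; empty ranges never overlap.
 */
bool ranges_overlap(uint64_t a_start, uint64_t a_size,
                    uint64_t b_start, uint64_t b_size)
{
    if (a_size == 0 || b_size == 0)
        return false;

    uint64_t a_last = a_start + a_size - 1;
    uint64_t b_last = b_start + b_size - 1;

    return a_start <= b_last && b_start <= a_last;
}

/* Refuse a user mapping [iova, iova + size) if it hits any reserved window. */
bool iova_map_allowed(const struct addr_range *reserved, size_t n,
                      uint64_t iova, uint64_t size)
{
    for (size_t i = 0; i < n; i++) {
        if (ranges_overlap(iova, size, reserved[i].start, reserved[i].size))
            return false;
    }
    return true;
}
```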
>>
>>>> 6) can't we live with iommu/MSI controller respective location uncertainty?
>>>>
>>>> - in my current series, with the above Xilinx MSI controller, I would
>>>> see there is an arm-smmu requiring mapping behind the PCI host, would
>>>> query the characteristics of the MSI doorbell (not implemented by that
>>>> controller), so no mapping would be done. So it would work I think.
>>>> - However in case we have this topology: PCIe host -> MSI controller
>>>> generally used behind an IOMMU (so registering a doorbell) -> IOMMU,
>>>> this wouldn't work since the doorbell would be mapped.
>>
>> I'm a little confused which direction "behind" is here, but it seems
>> like any time the MSI controller lives in the DMA address space of the
>> endpoint, both interfering with the available IOVA space to the user
>> and potentially an attack vector for the user, we need to call it out
>> as unsafe. Maybe some of them are for exclusive use of the device and
>> the attack vector is relatively contained, but they still affect the
>> IOVA space of the user. Such a configuration might be safe, but as I
>> said I'm not opposed to being pretty liberal in applying the unsafe
>> requirement if the platform has done something unfriendly. Thanks,
> OK that's clear.
>
> Thank you for your feedback
>
> Best Regards
>
> Eric
>>
>> Alex
>>