Re: [PATCH v4 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor.

From: Alexey Kardashevskiy
Date: Mon Dec 02 2019 - 23:25:50 EST




On 03/12/2019 15:05, Ram Pai wrote:
> On Tue, Dec 03, 2019 at 01:15:04PM +1100, Alexey Kardashevskiy wrote:
>>
>>
>> On 03/12/2019 13:08, Ram Pai wrote:
>>> On Tue, Dec 03, 2019 at 11:56:43AM +1100, Alexey Kardashevskiy wrote:
>>>>
>>>>
>>>> On 02/12/2019 17:45, Ram Pai wrote:
>>>>> The H_PUT_TCE_INDIRECT hcall takes, as one of its parameters, a page
>>>>> filled with TCE entries. One page per CPU is dedicated to this purpose
>>>>> for the lifetime of the kernel. On secure VMs the hypervisor sees only
>>>>> encrypted contents when it accesses this page, but it needs the
>>>>> unencrypted TCE entries to update the TCE table accordingly. There is
>>>>> nothing secret or sensitive about these entries, hence share the page
>>>>> with the hypervisor.
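
[ For reference, the sharing in question is roughly the following; a
sketch of the idea rather than the exact v4 hunk, reusing the existing
is_secure_guest()/uv_share_page() helpers; alloc_shared_tce_page() is
just an illustrative name: ]

/* Sketch, not the exact patch: allocate the per-cpu page that
 * H_PUT_TCE_INDIRECT consumes and, on a secure guest, share it with the
 * hypervisor so it can read the (non-sensitive) TCE entries. */
#include <linux/gfp.h>
#include <linux/pfn.h>
#include <asm/svm.h>
#include <asm/ultravisor.h>

static __be64 *alloc_shared_tce_page(void)
{
	__be64 *tcep = (__be64 *)__get_free_page(GFP_ATOMIC);

	if (tcep && is_secure_guest())
		uv_share_page(PHYS_PFN(__pa(tcep)), 1); /* one page, for the kernel's lifetime */

	return tcep;
}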
>>>>
>>>> This unsecures a page at a random location in the guest, which creates
>>>> an additional attack surface; hard to exploit, admittedly, but it is
>>>> still there. A safer option would be not to use the hcall-multi-tce
>>>> hypertas option (which translates to FW_FEATURE_MULTITCE in the
>>>> guest).
>>>
>>>
>>> Hmm... How do we not use it? AFAICT the hcall-multi-tce option gets used
>>> automatically when the IOMMU option is enabled.
>>
>> It is advertised by QEMU but the guest does not have to use it.
>
> Are you suggesting that even a normal guest not use hcall-multi-tce,
> or just a secure guest?


Just secure.
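
Something along these lines, for example (untested sketch; it assumes
the existing is_secure_guest() helper, and svm_drop_multitce() is just
an illustrative name - clearing the feature bit makes the IOMMU code
fall back to plain H_PUT_TCE, one TCE per hcall, so no guest page ever
needs to be shared):

/* Untested sketch: on a secure guest, pretend the hypervisor never
 * advertised hcall-multi-tce, so tce_build_pSeriesLP() (one H_PUT_TCE
 * per entry) is used instead of H_PUT_TCE_INDIRECT. */
#include <asm/firmware.h>
#include <asm/svm.h>

static void __init svm_drop_multitce(void)
{
	if (is_secure_guest())
		powerpc_firmware_features &= ~FW_FEATURE_MULTITCE;
}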


>
>>
>>> This happens even
>>> on a normal VM when the IOMMU is enabled.
>>>
>>>
>>>>
>>>> Also what is this for anyway?
>>>
>>> This is for sending indirect-TCE entries to the hypervisor.
>>> The hypervisor must be able to read those TCE entries, so that it can
>>> use those entries to populate the TCE table with the correct mappings.
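
(For reference, that flow is roughly the following, per the existing
tce_buildmulti_pSeriesLP(); a simplified sketch, not verbatim, and
put_tces_indirect() is just an illustrative name:)

/* Simplified sketch: fill the per-cpu page with up to 512 TCEs and hand
 * its real address to the hypervisor, which reads the entries and
 * updates the TCE table. */
#include <linux/kernel.h>
#include <asm/byteorder.h>
#include <asm/iommu.h>
#include <asm/tce.h>
#include <asm/plpar_wrappers.h>

static long put_tces_indirect(struct iommu_table *tbl, __be64 *tcep,
			      long tcenum, long npages,
			      unsigned long rpn, unsigned long proto_tce)
{
	long l, limit = min_t(long, npages, 4096 / TCE_ENTRY_SIZE);

	for (l = 0; l < limit; l++)
		tcep[l] = cpu_to_be64(proto_tce | (rpn++ << TCE_RPN_SHIFT));

	/* H_PUT_TCE_INDIRECT: liobn, ioba, real address of the TCE list, count */
	return plpar_tce_put_indirect((u64)tbl->it_index,
				      (u64)tcenum << tbl->it_page_shift,
				      (u64)__pa(tcep), limit);
}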
>>>
>>>> If I understand things right, you cannot
>>>> map any random guest memory; you should only be mapping that 64MB-ish
>>>> bounce buffer array. But 1) I do not see that happening (I may have
>>>> missed it), and 2) it needs to be done only once and takes little time
>>>> for whatever memory size we allow for bounce buffers anyway. Thanks,
>>>
>>> Any random guest memory can be shared by the guest.
>>
>> Yes but we do not want this to be this random.
>
> It is not sharing some random page. It is sharing a page that is
> earmarked for communicating TCE entries. Yes, the address of the page
> can be random, depending on where the allocator decides to allocate it,
> but the purpose of the page is not random.

I was talking about the location.


> That page is used for one specific purpose; to communicate the TCE
> entries to the hypervisor.
>
>> I thought the whole idea
>> of swiotlb was to restrict the amount of shared memory to the bare
>> minimum; what am I missing?
>
> I think you are making an incorrect connection between this patch and
> SWIOTLB. This patch has nothing to do with SWIOTLB.

I can see this and this is the confusing part.


>>
>>> Maybe you are confusing this with the SWIOTLB bounce buffers used by
>>> PCI devices, to transfer data to the hypervisor?
>>
>> Isn't this for pci+swiotlb?
>
>
> No. This patch is NOT for PCI+SWIOTLB. The SWIOTLB pages are a
> different set of pages allocated and earmarked for bounce buffering.
>
> This patch is purely to help the hypervisor set up the TCE table, in the
> presence of an IOMMU.

Then the hypervisor should be able to access the guest pages mapped for
DMA, and those pages should be made unsecure for this to work. Where/when
does that happen?
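
With swiotlb I would expect exactly one shared region - the bounce
buffer pool - made unsecure once at boot; on pseries SVM that ultimately
means uv_share_page() on the pool's pages, roughly like this (sketch
only, share_bounce_pool() is an illustrative name):

/* Sketch: the only memory a secure guest shares for DMA is the swiotlb
 * bounce buffer pool, and it is shared once, up front. */
#include <linux/types.h>
#include <linux/pfn.h>
#include <asm/page.h>
#include <asm/ultravisor.h>

static void __init share_bounce_pool(phys_addr_t pool, unsigned long bytes)
{
	uv_share_page(PHYS_PFN(pool), bytes >> PAGE_SHIFT);
}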


>> The cover letter suggests it is for
>> virtio-scsi-_pci_ with iommu_platform=on, which makes it a
>> normal PCI device just like emulated XHCI. Thanks,
>
> Well, I guess the cover letter is probably confusing. There are two
> patches which together enable virtio on secure guests in the presence
> of an IOMMU.
>
> The second patch enables virtio, in the presence of an IOMMU, to use the
> DMA ops + SWIOTLB infrastructure to correctly route I/O to virtio
> devices.

The second patch does nothing in relation to the problem being solved.
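
It is just about making virtio honour the DMA API (and hence swiotlb),
i.e. something of this shape, I assume - a guess at the idea, not the
actual hunk:

/* Guess at the shape only: make virtio go through the DMA API (and thus
 * the swiotlb bounce buffers) whenever guest memory is not accessible
 * to the device/hypervisor in clear text. */
#include <linux/dma-direct.h>
#include <linux/virtio_config.h>

static bool vring_use_dma_api(struct virtio_device *vdev)
{
	if (!virtio_has_iommu_quirk(vdev))
		return true;	/* VIRTIO_F_IOMMU_PLATFORM: always use the DMA API */

	if (force_dma_unencrypted(vdev->dev.parent))
		return true;	/* encrypted/secure guest: bounce via swiotlb */

	return false;
}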


> However, that by itself won't work if the TCE entries are not correctly
> set up in the TCE tables. The first patch, i.e. this patch, helps
> accomplish that.
>
> Hope this clears up the confusion.





--
Alexey