Re: [PATCH v4 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor.

From: David Gibson
Date: Tue Dec 03 2019 - 23:44:14 EST


On Wed, Dec 04, 2019 at 12:08:09PM +1100, Alexey Kardashevskiy wrote:
>
>
> On 04/12/2019 11:49, Ram Pai wrote:
> > On Wed, Dec 04, 2019 at 11:04:04AM +1100, Alexey Kardashevskiy wrote:
> >>
> >>
> >> On 04/12/2019 03:52, Ram Pai wrote:
> >>> On Tue, Dec 03, 2019 at 03:24:37PM +1100, Alexey Kardashevskiy wrote:
> >>>>
> >>>>
> >>>> On 03/12/2019 15:05, Ram Pai wrote:
> >>>>> On Tue, Dec 03, 2019 at 01:15:04PM +1100, Alexey Kardashevskiy wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 03/12/2019 13:08, Ram Pai wrote:
> >>>>>>> On Tue, Dec 03, 2019 at 11:56:43AM +1100, Alexey Kardashevskiy wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 02/12/2019 17:45, Ram Pai wrote:
> >>>>>>>>> H_PUT_TCE_INDIRECT hcall uses a page filled with TCE entries, as one of
> >>>>>>>>> its parameters. One page is dedicated per cpu, for the lifetime of the
> >>>>>>>>> kernel for this purpose. On secure VMs, contents of this page, when
> >>>>>>>>> accessed by the hypervisor, retrieves encrypted TCE entries. Hypervisor
> >>>>>>>>> needs to know the unencrypted entries, to update the TCE table
> >>>>>>>>> accordingly. There is nothing secret or sensitive about these entries.
> >>>>>>>>> Hence share the page with the hypervisor.
> >>>>>>>>
> >>>>>>>> This unsecures a page in the guest in a random place which creates an
> >>>>>>>> additional attack surface which is hard to exploit indeed but
> >>>>>>>> nevertheless it is there.
> >>>>>>>> A safer option would be not to use the
> >>>>>>>> hcall-multi-tce hyperrtas option (which translates FW_FEATURE_MULTITCE
> >>>>>>>> in the guest).
> >>>>>>>
> >>>>>>>
> >>>>>>> Hmm... How do we not use it? AFAICT hcall-multi-tce option gets invoked
> >>>>>>> automatically when IOMMU option is enabled.
> >>>>>>
> >>>>>> It is advertised by QEMU but the guest does not have to use it.
> >>>>>
> >>>>> Are you suggesting that even normal-guest, not use hcall-multi-tce?
> >>>>> or just secure-guest?
> >>>>
> >>>>
> >>>> Just secure.
> >>>
> >>> hmm.. how are the TCE entries communicated to the hypervisor, if
> >>> hcall-multi-tce is disabled?
> >>
> >> Via H_PUT_TCE which updates 1 entry at once (sets or clears).
> >> hcall-multi-tce enables H_PUT_TCE_INDIRECT (512 entries at once) and
> >> H_STUFF_TCE (clearing, up to 4bln at once? many), these are simply an
> >> optimization.
> >
> > Do you still think, secure-VM should use H_PUT_TCE and not
> > H_PUT_TCE_INDIRECT? And normal VM should use H_PUT_TCE_INDIRECT?
> > Is there any advantage of special casing it for secure-VMs.
>
>
> Reducing the amount of insecure memory at random location.

The other approach we could use for that - which would still allow
H_PUT_TCE_INDIRECT, would be to allocate the TCE buffer page from the
same pool that we use for the bounce buffers. I assume there must
already be some sort of allocator for that?

> > In fact, we could make use of as much optimization as possible.
> >
> >
> >>
> >>>>>> Is not this for pci+swiotlb?
> > ..snip..
> >>>>> This patch is purely to help the hypervisor setup the TCE table, in the
> >>>>> presence of a IOMMU.
> >>>>
> >>>> Then the hypervisor should be able to access the guest pages mapped for
> >>>> DMA and these pages should be made unsecure for this to work. Where/when
> >>>> does this happen?
> >>>
> >>> This happens in the SWIOTLB code. The code to do that is already
> >>> upstream.
> >>>
> >>> The sharing of the pages containing the SWIOTLB bounce buffers is done
> >>> in init_svm() which calls swiotlb_update_mem_attributes() which calls
> >>> set_memory_decrypted(). In the case of pseries, set_memory_decrypted() calls
> >>> uv_share_page().
> >>
> >>
> >> This does not seem enough as when you enforce iommu_platform=on, QEMU
> >> starts accessing virtio buffers via IOMMU so bounce buffers have to be
> >> mapped explicitly, via H_PUT_TCE&co, where does this happen?
> >>
> >
> > I think, it happens at boot time. Every page of the guest memory is TCE
> > mapped, if iommu is enabled. SWIOTLB pages get implicitly TCE-mapped
> > as part of that operation.
>
>
> Ah I see. This works via the huge dma window. Ok, makes sense now.
>
> It just seems like a waste that we could map swiotlb 1:1 via the always
> existing small DMA window but instead we rely on a huge window to map
> these small buffers. This way we are wasting the entire 32bit window and
> most of the huge window. We may fix it in the future (not right now) but
> for now I would still avoid unsecuring additional memory. Thanks,
>
>

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature