RE: [PATCH v4 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor.
From: Ram Pai
Date: Mon Dec 02 2019 - 23:05:28 EST
On Tue, Dec 03, 2019 at 01:15:04PM +1100, Alexey Kardashevskiy wrote:
>
>
> On 03/12/2019 13:08, Ram Pai wrote:
> > On Tue, Dec 03, 2019 at 11:56:43AM +1100, Alexey Kardashevskiy wrote:
> >>
> >>
> >> On 02/12/2019 17:45, Ram Pai wrote:
> >>> H_PUT_TCE_INDIRECT hcall uses a page filled with TCE entries, as one of
> >>> its parameters. One page is dedicated per cpu, for the lifetime of the
> >>> kernel for this purpose. On secure VMs, contents of this page, when
> >>> accessed by the hypervisor, retrieves encrypted TCE entries. Hypervisor
> >>> needs to know the unencrypted entries, to update the TCE table
> >>> accordingly. There is nothing secret or sensitive about these entries.
> >>> Hence share the page with the hypervisor.
> >>
> >> This unsecures a page in the guest in a random place which creates an
> >> additional attack surface which is hard to exploit indeed but
> >> nevertheless it is there.
> >> A safer option would be not to use the
> >> hcall-multi-tce hypertas option (which translates to FW_FEATURE_MULTITCE
> >> in the guest).
> >
> >
> > Hmm... How do we not use it? AFAICT hcall-multi-tce option gets invoked
> > automatically when IOMMU option is enabled.
>
> It is advertised by QEMU but the guest does not have to use it.
Are you suggesting that even a normal guest should not use
hcall-multi-tce? Or just a secure guest?
>
> > This happens even
> > on a normal VM when IOMMU is enabled.
> >
> >
> >>
> >> Also what is this for anyway?
> >
> > This is for sending indirect-TCE entries to the hypervisor.
> > The hypervisor must be able to read those TCE entries, so that it can
> > use those entries to populate the TCE table with the correct mappings.
> >
> >> if I understand things right, you cannot
> >> map any random guest memory, you should only be mapping that 64MB-ish
> >> bounce buffer array but 1) I do not see that happening (I may have
> >> missed it) 2) it should be done once and it takes a little time for
> >> whatever memory size we allow for bounce buffers anyway. Thanks,
> >
> > Any random guest memory can be shared by the guest.
>
> Yes but we do not want this to be this random.
It is not sharing some random page. It is sharing a page that is
earmarked for communicating TCE entries. Yes, the address of the page
can be random, depending on where the allocator decides to allocate it.
The purpose of the page is not random.
That page is used for one specific purpose: to communicate the TCE
entries to the hypervisor.
> I thought the whole idea
> of swiotlb was to restrict the amount of shared memory to bare minimum,
> what do I miss?
I think you are making an incorrect connection between this patch and
SWIOTLB. This patch has nothing to do with SWIOTLB.
>
> > Maybe you are confusing this with the SWIOTLB bounce buffers used by
> > PCI devices, to transfer data to the hypervisor?
>
> Is not this for pci+swiotlb?
No. This patch is NOT for PCI+SWIOTLB. The SWIOTLB pages are a
different set of pages, allocated and earmarked for bounce buffering.
This patch is purely to help the hypervisor set up the TCE table in the
presence of an IOMMU.
> The cover letter suggests it is for
> virtio-scsi-_pci_ with iommu_platform=on which makes it a
> normal pci device just like emulated XHCI. Thanks,
Well, I guess the cover letter is probably confusing. There are two
patches, which together enable virtio on secure guests in the presence
of an IOMMU.
The second patch enables virtio, in the presence of an IOMMU, to use
the DMA_ops+SWIOTLB infrastructure to correctly route the I/O to virtio
devices.
However, that by itself won't work if the TCE entries are not correctly
set up in the TCE tables. The first patch, i.e. this patch, helps
accomplish that.
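To make the point concrete, here is a toy user-space model of the
problem. It is only a sketch and not the patch itself: the names
(guest_page, scramble, hv_read, share_page, hv_put_tce_indirect) and
the XOR "encryption" are made up for illustration, and the 512-entry
page size is an assumption. It shows only that the hypervisor copies
usable TCE entries out of the page once the page is shared.

```c
/* Toy model only: hypothetical names, not kernel or hypervisor code. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCE_ENTRIES 512	/* assumed: 4K page / 8-byte TCE entry */

struct guest_page {
	uint64_t tce[TCE_ENTRIES];	/* entries as written by the guest */
	int shared;			/* non-zero once shared with the HV */
};

/* Stand-in for memory encryption on a secure VM (illustrative only). */
static uint64_t scramble(uint64_t v)
{
	return v ^ 0xA5A5A5A5A5A5A5A5ULL;
}

/* What the hypervisor observes when it reads entry i of the page. */
static uint64_t hv_read(const struct guest_page *p, size_t i)
{
	assert(i < TCE_ENTRIES);
	return p->shared ? p->tce[i] : scramble(p->tce[i]);
}

/* Guest side: mark the dedicated TCE page as shared. */
static void share_page(struct guest_page *p)
{
	p->shared = 1;
}

/*
 * Hypervisor side: a simplified stand-in for handling an indirect-TCE
 * request. Copies n entries from the guest page into the TCE table.
 */
static void hv_put_tce_indirect(uint64_t *table, size_t start,
				const struct guest_page *p, size_t n)
{
	for (size_t i = 0; i < n; i++)
		table[start + i] = hv_read(p, i);
}
```

In this model, calling hv_put_tce_indirect() before share_page() fills
the table with scrambled values; calling it after share_page() fills it
with the entries the guest actually wrote.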
Hope this clears up the confusion.
RP