Re: [PATCH] virtio: Force DMA restricted devices through DMA API

From: Keir Fraser
Date: Wed Jul 20 2022 - 04:28:12 EST


On Wed, Jul 20, 2022 at 02:59:53AM -0400, Michael S. Tsirkin wrote:
> On Tue, Jul 19, 2022 at 04:11:50PM +0000, Keir Fraser wrote:
> > On Tue, Jul 19, 2022 at 08:51:54AM -0700, Christoph Hellwig wrote:
> > > On Tue, Jul 19, 2022 at 03:46:08PM +0000, Keir Fraser wrote:
> > > > However, if the general idea at least is acceptable, would the
> > > > implementation be acceptable if I add an explicit API for this to the
> > > > DMA subsystem, and hide the detail there?
> > >
> > > I don't think so. The right thing to key off is
> > > VIRTIO_F_ACCESS_PLATFORM, which really should be set in any modern
> > > virtio device after all the problems we had with the lack of it.
> >
> > Ok. Certainly the flag description in virtio spec fits the bill.
>
> Maybe. But balloon really isn't a normal device. Yes the rings kind of
> operate more or less normally. However consider for example free page
> hinting. We stick a page address in ring for purposes of telling host it
> can blow it away at any later time unless we write something into the
> page. Free page reporting is similar.
> Sending gigabytes of memory through swiotlb is at minimum not
> a good idea.

I don't think any balloon use case needs the page's guest data to be
made available to the host device. What the device side does with
reported guest-physical page addresses is somewhat VMM/Hyp specific,
but I think it's fair to say it will know how to reclaim or track
pages by guest PA, and bouncing reported/hinted page addresses through
a SWIOTLB or IOMMU would not be of any use to any use case I can think
of.

As far as I can see, the only translation that makes sense at all for
virtio balloon is in ring management.

> Conversely, is it even okay security wise that host can blow away any
> guest page? Because with balloon GFNs do not go through bounce
> buffering.

Ok, I think this is fair and I can address that by describing the use
case more broadly.

The short answer is that there will be more patches forthcoming,
because the balloon driver will need to tell the hypervisor (EL2 Hyp
in the ARM PKVM case) that is is willingly relinquishing memory
pages. So, there will be a patch to add a new HVC to PKVM Hyp, and a
patch to detect and use the new HVC via a new API that hooks into
Linux's balloon infrastructure.

So the use case is that virtio_balloon needs to set up its rings and
descriptors in a shared memory region, as requested via
dma-restricted-pool and the VIRTIO_F_PALTFORM_ACCESS flag. This is
required because the host device has no access to regular guest
memory.

However, in PKVM, page notifications will notify both the (trusted)
Hypervisor, via hypercall, and the (untrusted) VMM, via virtio. Guest
physical addresses are fine here. The VMM understands guest PAs
perfectly well, it's just not normally allowed to access their
contents or otherwise map or use those pages itself.

>
> > > > Or a completely different approach would be to revert the patch
> > > > e41b1355508d which clears VIRTIO_F_ACCESS_PLATFORM in the balloon
> > > > driver. MST: That's back in your court, as it's your patch!
> > >
> > > Which also means this needs to be addresses, but I don't think a
> > > simple revert is enough.
> >
> > Well here are two possible approaches:
> >
> > 1. Revert e41b1355508d outright. I'm not even sure what it would mean
> > for reported pages to go through IOMMU. And VIRTIO_F_ACCESS_PLATFORM
> > is no longer IOMMU-specific anyway.
> >
> > 2. Continue to clear the flag during virtio_balloon negotiation, but
> > remember that it was offered, and test for that in vring_use_dma_api()
> > as well as, or instead of, virtio_has_dma_quirk().
> >
> > Do either of those appeal?
>
> I think the use case needs to be documented better.

I hope the above is helpful in giving some more context.

Perhaps it would make more sense to re-submit this patch as part of
a larger series that includes the rest of the mechanism needed to
support virtio_balloon on PKVM?

Thanks,
Keir

>
> --
> MST
>
>