Re: [PATCH] swiotlb: Validate bounce size in the sync/unmap path

From: Konrad Rzeszutek Wilk
Date: Tue Feb 02 2021 - 18:15:02 EST


On Tue, Feb 02, 2021 at 04:34:09PM -0600, Tom Lendacky wrote:
> On 2/2/21 10:37 AM, Konrad Rzeszutek Wilk wrote:
> > On Mon, Jan 25, 2021 at 07:33:35PM +0100, Martin Radev wrote:
> >> On Mon, Jan 18, 2021 at 10:14:28AM -0500, Konrad Rzeszutek Wilk wrote:
> >>> On Mon, Jan 18, 2021 at 12:44:58PM +0100, Martin Radev wrote:
> >>>> On Wed, Jan 13, 2021 at 12:30:17PM +0100, Christoph Hellwig wrote:
> >>>>> On Tue, Jan 12, 2021 at 04:07:29PM +0100, Martin Radev wrote:
> >>>>>> The size of the buffer being bounced is not checked if it happens
> >>>>>> to be larger than the size of the mapped buffer. Because the size
> >>>>>> can be controlled by a device, as it's the case with virtio devices,
> >>>>>> this can lead to memory corruption.
> >>>>>>
> >>>>>
> >>>>> I'm really worried about all these hodge podge hacks for not trusted
> >>>>> hypervisors in the I/O stack. Instead of trying to harden protocols
> >>>>> that are fundamentally not designed for this, how about instead coming
> >>>>> up with a new paravirtualized I/O interface that is specifically
> >>>>> designed for use with an untrusted hypervisor from the start?
> >>>>
> >>>> Your comment makes sense but then that would require the cooperation
> >>>> of these vendors and the cloud providers to agree on something meaningful.
> >>>> I am also not sure whether the end result would be better than hardening
> >>>> this interface to catch corruption. There is already some validation in
> >>>> unmap path anyway.
> >>>>
> >>>> Another possibility is to move this hardening to the common virtio code,
> >>>> but I think the code may become more complicated there since it would
> >>>> require tracking both the dma_addr and length for each descriptor.
> >>>
> >>> Christoph,
> >>>
> >>> I've been wrestling with the same thing - this is specific to busted
> >>> drivers. And in reality you could do the same thing with a hardware
> >>> virtio device (see example in https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fthunderclap.io%2F&data=04%7C01%7Cthomas.lendacky%40amd.com%7Cfc27af49d9a943699f6c08d8c798eed4%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637478806973542212%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=aUVqobkOSDfDhCAEauABOUvCAaIcw%2FTh07YFxeBjBDU%3D&reserved=0) - where the
> >>> mitigation is 'enable the IOMMU to do its job.'.
> >>>
> >>> AMD SEV documents speak about utilizing IOMMU to do this (AMD SEV-SNP)..
> >>> and while that is great in the future, SEV without IOMMU is now here.
> >>>
> >>> Doing a full circle here, this issue can be exploited with virtio
> >>> but you could say do that with real hardware too if you hacked the
> >>> firmware, so if you say used Intel SR-IOV NIC that was compromised
> >>> on an AMD SEV machine, and plumbed in the guest - the IOMMU inside
> >>> of the guest would be SWIOTLB code. Last line of defense against
> >>> bad firmware to say.
> >>>
> >>> As such I am leaning towards taking this code, but I am worried
> >>> about the performance hit .. but perhaps I shouldn't as if you
> >>> are using SWIOTLB=force already you are kind of taking a
> >>> performance hit?
> >>>
> >>
> >> I have not measured the performance degradation. This will hit all AMD SEV,
> >> Intel TDX, IBM Protected Virtualization VMs. I don't expect the hit to
> >> be large since there are only few added operations per hundreads of copied
> >> bytes. I could try to measure the performance hit by running some benchmark
> >> with virtio-net/virtio-blk/virtio-rng.
> >>
> >> Earlier I said:
> >>>> Another possibility is to move this hardening to the common virtio code,
> >>>> but I think the code may become more complicated there since it would
> >>>> require tracking both the dma_addr and length for each descriptor.
> >>
> >> Unfortunately, this doesn't make sense. Even if there's validation for
> >> the size in the common virtio layer, there will be some other device
> >> which controls a dma_addr and length passed to dma_unmap* in the
> >> corresponding driver. The device can target a specific dma-mapped private
> >> buffer by changing the dma_addr and set a good length to overwrite buffers
> >> following it.
> >>
> >> So, instead of doing the check in every driver and hitting a performance
> >> cost even when swiotlb is not used, it's probably better to fix it in
> >> swiotlb.
> >>
> >> @Tom Lendacky, do you think that it makes sense to harden swiotlb or
> >> some other approach may be better for the SEV features?
> >
> > I am not Tom, but this change seems the right way forward regardless if
> > is TDX, AMD SEV, or any other architecture that encrypt memory and use
> > SWIOTLB.
>
> Sorry, I missed the @Tom before. I'm with Konrad and believe it makes
> sense to add these checks.
>
> I'm not sure if there would be a better approach for all confidential
> computing technologies. SWIOTLB works nicely, but is limited because of
> the 32-bit compatible memory location. Being able to have buffers above
> the 32-bit limit would alleviate that, but that is support that would have
> to be developed.

Funny you mention that.. Dongli (CCed) is working on exactly that and
should be posting his patches the next couple of days.

>
> Thanks,
> Tom
>
> >
> > Let me queue it up in development branch and do some regression testing.
> >>