Re: [PATCH v5 2/2] powerpc/pseries/iommu: Use dma_iommu_ops for Secure VM.

From: Michael Roth
Date: Tue Dec 10 2019 - 20:44:00 EST

Quoting Ram Pai (2019-12-06 19:12:39)
> Commit edea902c1c1e ("powerpc/pseries/iommu: Don't use dma_iommu_ops on
> secure guests")
> disabled dma_iommu_ops path, for secure VMs. Disabling dma_iommu_ops
> path for secure VMs, helped enable dma_direct path. This enabled
> support for bounce-buffering through SWIOTLB. However it fails to
> operate when IOMMU is enabled, since I/O pages are not TCE mapped.
> Renable dma_iommu_ops path for pseries Secure VMs. It handles all
> cases including, TCE mapping I/O pages, in the presence of a

Wasn't clear to me at first, but I guess the main gist of this series is
that we want to continue to use SWIOTLB, but also need to create mappings
of it's bounce buffers in the IOMMU, so we revert to using dma_iommu_ops
and rely on the various dma_iommu_{map,alloc}_bypass() hooks throughout
to call into dma_direct_* ops rather than relying on the dma_is_direct(ops)
checks in DMA API functions to do the same.

That makes sense, but one issue I see with that is that
dma_iommu_map_bypass() only tests true if all the following are true:

1) the device requests a 64-bit DMA mask via
2) DDW is enabled (i.e. we don't pass disable_ddw on command-line)

dma_is_direct() checks don't have this limitation, so I think for
anything cases, such as devices that use a smaller DMA mask, we'll
end up falling back to the non-bypass functions in dma_iommu_ops, which
will likely break for things like dma_alloc_coherent/dma_map_single
since they won't use SWIOTLB pages and won't do the necessary calls to
set_memory_unencrypted() to share those non-SWIOTLB buffers with

Maybe that's ok, but I think we should be clearer about how to
fail/handle these cases.

Though I also agree with some concerns Alexey stated earlier: it seems
wasteful to map the entire DDW window just so these bounce buffers can be
mapped. Especially if you consider the lack of a mapping to be an additional
safe-guard against things like buggy device implementations on the QEMU
side. E.g. if we leaked pages to the hypervisor on accident, those pages
wouldn't be immediately accessible to a device, and would still require
additional work get past the IOMMU.

What would it look like if we try to make all this work with disable_ddw passed
to kernel command-line (or forced for is_secure_guest())?

1) dma_iommu_{alloc,map}_bypass() would no longer get us to dma_direct_* ops,
but an additional case or hook that considers is_secure_guest() might do

2) We'd also need to set up an IOMMU mapping for the bounce buffers via
io_tlb_start/io_tlb_end. We could do it once, on-demand via
dma_iommu_bypass_supported() like we do for the 64-bit DDW window, or
maybe in some init function.

That also has the benefit of not requiring devices to support 64-bit DMA.

Alternatively, we could continue to rely on the 64-bit DDW window, but
modify call to enable_ddw() to only map the io_tlb_start/end range in
the case of is_secure_guest(). This is a little cleaner implementation-wise
since we can rely on the existing dma_iommu_{alloc,map}_bypass() hooks, but
devices that don't support 64-bit will fail back to not using dma_direct_* ops
and fail miserably. We'd probably want to handle that more gracefully.

Or we handle both cases gracefully. To me it makes more sense to enable
non-DDW case, then consider adding DDW case later if there's some reason
why 64-bit DMA is needed. But would be good to hear if there are other

> Signed-off-by: Ram Pai <linuxram@xxxxxxxxxx>
> ---
> arch/powerpc/platforms/pseries/iommu.c | 11 +----------
> 1 file changed, 1 insertion(+), 10 deletions(-)
> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> index 67b5009..4e27d66 100644
> --- a/arch/powerpc/platforms/pseries/iommu.c
> +++ b/arch/powerpc/platforms/pseries/iommu.c
> @@ -36,7 +36,6 @@
> #include <asm/udbg.h>
> #include <asm/mmzone.h>
> #include <asm/plpar_wrappers.h>
> -#include <asm/svm.h>
> #include <asm/ultravisor.h>
> #include "pseries.h"
> @@ -1346,15 +1345,7 @@ void iommu_init_early_pSeries(void)
> of_reconfig_notifier_register(&iommu_reconfig_nb);
> register_memory_notifier(&iommu_mem_nb);
> - /*
> - * Secure guest memory is inacessible to devices so regular DMA isn't
> - * possible.
> - *
> - * In that case keep devices' dma_map_ops as NULL so that the generic
> - * DMA code path will use SWIOTLB to bounce buffers for DMA.
> - */
> - if (!is_secure_guest())
> - set_pci_dma_ops(&dma_iommu_ops);
> + set_pci_dma_ops(&dma_iommu_ops);
> }
> static int __init disable_multitce(char *str)
> --