Re: [PATCH hyperv-next 5/6] arch, drivers: Add device struct bitfield to not bounce-buffer

From: Roman Kisel
Date: Wed Apr 09 2025 - 12:48:49 EST




On 4/9/2025 9:03 AM, Robin Murphy wrote:
> On 2025-04-09 1:08 am, Roman Kisel wrote:
>> Bounce-buffering makes the system spend more time copying
>> I/O data. When an I/O transaction takes place between a
>> confidential and a non-confidential endpoint, there is no
>> way around it.
>>
>> Introduce a device bitfield to indicate that the device
>> doesn't need to perform bounce buffering. A capable device
>> may employ it to avoid copying data around.

> It's not so much about bounce buffering, it's more fundamentally
> about whether the device is trusted and able to access private
> memory at all or not. And performance is hardly the biggest concern
> either - if you do trust a device to operate on confidential data
> in private memory, then surely it is crucial to actively *prevent*
> that data ever getting into shared SWIOTLB pages where anyone else
> could also get at it. At worst that means CoCo VMs might need an
> *additional* non-shared SWIOTLB to support trusted devices with
> addressing limitations (and/or "swiotlb=force" debugging,
> potentially).

Thanks, I certainly should have highlighted that facet!


> Also whatever we do for this really wants to tie in with the
> nascent TDISP stuff as well, since we definitely don't want to end
> up with more than one notion of whether a device is in a
> trusted/locked/private/etc. vs. unlocked/shared/etc. state with
> respect to DMA (or indeed anything else if we can avoid it).

Wouldn't TDISP be per-device as well? In that case, a per-device
flag like the one added in this patch would still be needed.

Although, there must be a difference between a TDISP device, where
the flag would indicate the feature, and this code, where the driver
may flip the flag back and forth...

Do you feel this is shoehorned into `struct device`? I couldn't find
an appropriate private (== opaque pointer) part of the structure to
store that bit (`struct device_private` wouldn't fit the bill), and it
looked like adding it to the struct itself would do no harm. However,
my read of the room is that folks see that as dubious :)

What would be your opinion on where to store that flag, so that its
use in the Hyper-V SCSI driver ties together with skipping the bounce
buffering?
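
For illustration, here is roughly how a driver would opt in once it
has decided the device can be trusted with private memory. This is a
sketch only: the function name and the `attested` parameter are made
up for this example, not taken from this series.

#include <linux/device.h>

/*
 * Illustration only: once a driver has established, e.g. via
 * attestation, that its device may access private (encrypted)
 * memory, mark the device so that the DMA layer stops bouncing
 * its I/O through shared SWIOTLB pages.
 */
static void example_enable_private_io(struct device *dev, bool attested)
{
	dev->use_priv_pages_for_io = attested;
}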


> Thanks,
> Robin.

Signed-off-by: Roman Kisel <romank@xxxxxxxxxxxxxxxxxxx>
---
  arch/x86/mm/mem_encrypt.c  | 3 +++
  include/linux/device.h     | 8 ++++++++
  include/linux/dma-direct.h | 3 +++
  include/linux/swiotlb.h    | 3 +++
  4 files changed, 17 insertions(+)

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 95bae74fdab2..6349a02a1da3 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -19,6 +19,9 @@
  /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
  bool force_dma_unencrypted(struct device *dev)
  {
+    if (dev->use_priv_pages_for_io)
+        return false;
+
      /*
       * For SEV, all DMA must be to unencrypted addresses.
       */
diff --git a/include/linux/device.h b/include/linux/device.h
index 80a5b3268986..4aa4a6fd9580 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -725,6 +725,8 @@ struct device_physical_location {
   * @dma_skip_sync: DMA sync operations can be skipped for coherent buffers.
   * @dma_iommu: Device is using default IOMMU implementation for DMA and
   *        doesn't rely on dma_ops structure.
+ * @use_priv_pages_for_io: Device is using private pages for I/O, no need to
+ *        bounce-buffer.
   *
   * At the lowest level, every device in a Linux system is represented by an
   * instance of struct device. The device structure contains the information
@@ -843,6 +845,7 @@ struct device {
  #ifdef CONFIG_IOMMU_DMA
      bool            dma_iommu:1;
  #endif
+    bool            use_priv_pages_for_io:1;
  };

  /**
@@ -1079,6 +1082,11 @@ static inline bool dev_removable_is_valid(struct device *dev)
      return dev->removable != DEVICE_REMOVABLE_NOT_SUPPORTED;
  }

+static inline bool dev_priv_pages_for_io(struct device *dev)
+{
+    return dev->use_priv_pages_for_io;
+}
+
  /*
   * High level routines for use by the bus drivers
   */
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index d7e30d4f7503..b096369f847e 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -94,6 +94,9 @@ static inline dma_addr_t phys_to_dma_unencrypted(struct device *dev,
   */
  static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
  {
+    if (dev_priv_pages_for_io(dev))
+        return phys_to_dma_unencrypted(dev, paddr);
+
      return __sme_set(phys_to_dma_unencrypted(dev, paddr));
  }
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 3dae0f592063..35ee10641b42 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -173,6 +173,9 @@ static inline bool is_swiotlb_force_bounce(struct device *dev)
  {
      struct io_tlb_mem *mem = dev->dma_io_tlb_mem;

+    if (dev_priv_pages_for_io(dev))
+        return false;
+
      return mem && mem->force_bounce;
  }


--
Thank you,
Roman