[RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma
From: jglisse
Date: Tue Jan 29 2019 - 12:47:49 EST
From: Jérôme Glisse <jglisse@xxxxxxxxxx>
Allow mmap of a device file to export device memory to peer to peer
devices. This will allow, for instance, a network device to access
GPU memory or a storage device queue directly.
The common case will be a vma created by a userspace device driver
that is then shared with another userspace device driver, which calls
into its kernel device driver to map that vma.
The vma does not need to have any valid CPU mapping, in which case
only peer to peer devices can access its content. It can also have a
valid CPU mapping, in which case that mapping should point to the
same memory for consistency.
Note that peer to peer mapping is highly platform and device
dependent and it might not work in all cases. However, we do
expect support for this to grow on more hardware platforms.
This patch only adds new callbacks to vm_operations_struct; the bulk
of the code lives within the common bus driver (like PCI) and the
device drivers (both the exporting and the importing device).
The current design mandates that the importer must obey mmu_notifier
and invalidate any peer to peer mapping any time an invalidation
notification happens for a range that has been peer to peer
mapped. This allows the exporter device to easily invalidate mappings
for any importer device.
Signed-off-by: Jérôme Glisse <jglisse@xxxxxxxxxx>
Cc: Logan Gunthorpe <logang@xxxxxxxxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Rafael J. Wysocki <rafael@xxxxxxxxxx>
Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
Cc: Christian Koenig <christian.koenig@xxxxxxx>
Cc: Felix Kuehling <Felix.Kuehling@xxxxxxx>
Cc: Jason Gunthorpe <jgg@xxxxxxxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx
Cc: linux-pci@xxxxxxxxxxxxxxx
Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
Cc: Christoph Hellwig <hch@xxxxxx>
Cc: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
Cc: Robin Murphy <robin.murphy@xxxxxxx>
Cc: Joerg Roedel <jroedel@xxxxxxx>
Cc: iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx
---
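The importer contract described above (map once at setup, unmap when an
mmu_notifier invalidation covers the range, then map again) can be sketched
as a small userspace mock. This is only an illustration under stated
assumptions, not kernel code: the struct definitions are simplified stand-ins
for the kernel types, and mock_exporter, mock_importer, importer_map(),
importer_invalidate() and the MOCK_* constants are hypothetical names. The
patch does not document the return values, so the mock assumes p2p_map()
returns the number of pages mapped or a negative errno, that ranges are page
aligned, and that p2p_unmap() receives the array that p2p_map() filled.

```c
/* Userspace mock only: every type is a simplified stand-in for the kernel
 * definition and the return convention is an assumption. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_PAGE_SIZE 4096UL
#define MOCK_MAX_PAGES 8

typedef uint64_t dma_addr_t;
struct device { const char *name; };
struct vm_area_struct;

/* Only the two callbacks added by this patch are modelled. */
struct vm_operations_struct {
	long (*p2p_map)(struct vm_area_struct *vma, struct device *device,
			unsigned long start, unsigned long end,
			dma_addr_t *pa, bool write);
	long (*p2p_unmap)(struct vm_area_struct *vma, struct device *device,
			  unsigned long start, unsigned long end,
			  dma_addr_t *pa);
};

struct vm_area_struct {
	unsigned long vm_start, vm_end;
	const struct vm_operations_struct *vm_ops;
	void *vm_private_data;
};

/* Hypothetical exporter state: device memory currently sits at bar_base. */
struct mock_exporter { dma_addr_t bar_base; };

static long mock_p2p_map(struct vm_area_struct *vma, struct device *device,
			 unsigned long start, unsigned long end,
			 dma_addr_t *pa, bool write)
{
	struct mock_exporter *exp = vma->vm_private_data;
	unsigned long addr;
	long n = 0;

	(void)device;
	(void)write;
	if (start < vma->vm_start || end > vma->vm_end || start >= end ||
	    (start | end) % MOCK_PAGE_SIZE)
		return -EINVAL;
	for (addr = start; addr < end; addr += MOCK_PAGE_SIZE)
		pa[n++] = exp->bar_base + (addr - vma->vm_start);
	return n; /* assumed: pages mapped on success, -errno on failure */
}

static long mock_p2p_unmap(struct vm_area_struct *vma, struct device *device,
			   unsigned long start, unsigned long end,
			   dma_addr_t *pa)
{
	unsigned long addr;
	long n = 0;

	(void)vma;
	(void)device;
	for (addr = start; addr < end; addr += MOCK_PAGE_SIZE)
		pa[n++] = 0; /* forget the stale bus addresses */
	return 0;
}

static const struct vm_operations_struct mock_vm_ops = {
	.p2p_map = mock_p2p_map,
	.p2p_unmap = mock_p2p_unmap,
};

/* Hypothetical importer state for one peer to peer mapped range. */
struct mock_importer {
	struct device dev;
	struct vm_area_struct *vma;
	unsigned long start, end;
	dma_addr_t pa[MOCK_MAX_PAGES];
	bool mapped;
};

static long importer_map(struct mock_importer *imp, bool write)
{
	const struct vm_operations_struct *ops = imp->vma->vm_ops;
	long ret;

	assert(imp->end - imp->start <= MOCK_MAX_PAGES * MOCK_PAGE_SIZE);
	if (!ops || !ops->p2p_map || !ops->p2p_unmap)
		return -ENOTSUP; /* the callbacks are optional */
	ret = ops->p2p_map(imp->vma, &imp->dev, imp->start, imp->end,
			   imp->pa, write);
	imp->mapped = ret > 0;
	return ret;
}

/* What the importer's mmu_notifier invalidation handler would do. */
static void importer_invalidate(struct mock_importer *imp,
				unsigned long start, unsigned long end)
{
	if (!imp->mapped || end <= imp->start || start >= imp->end)
		return; /* no overlap with the mapped range */
	imp->vma->vm_ops->p2p_unmap(imp->vma, &imp->dev, imp->start,
				    imp->end, imp->pa);
	imp->mapped = false;
}
```

A caller would run importer_map() once during setup and report any failure
to userspace then, call importer_invalidate() from its notifier, and call
importer_map() again afterwards. In this mock the second map simply picks up
whatever bar_base the exporter holds at that point, which stands in for the
exporter having moved its memory in between.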
include/linux/mm.h | 38 ++++++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 80bb6408fe73..1bd60a90e575 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -429,6 +429,44 @@ struct vm_operations_struct {
pgoff_t start_pgoff, pgoff_t end_pgoff);
unsigned long (*pagesize)(struct vm_area_struct * area);
+ /*
+ * Optional, for device drivers that want to allow peer to peer (p2p)
+ * mapping of their vma (which can be backed by some device memory) to
+ * another device.
+ *
+ * Note that the exporting device driver might not have mapped anything
+ * inside the vma for the CPU but might still want to allow a peer
+ * device to access the range of memory corresponding to a range in
+ * that vma.
+ *
+ * FOR PREDICTABILITY, IF A DRIVER SUCCESSFULLY MAPS A RANGE ONCE FOR A
+ * DEVICE, THEN FURTHER MAPPINGS OF THE SAME RANGE SHOULD ALSO SUCCEED
+ * AS LONG AS THE VMA IS STILL VALID. Following this rule allows the
+ * importing device to map once during setup and report any failure at
+ * that time to userspace. Further mappings of the same range might
+ * happen after mmu notifier invalidation over the range. The
+ * exporting device can use this to move things around (defrag BAR
+ * space for instance) or do other similar tasks.
+ *
+ * IMPORTER MUST OBEY mmu_notifier NOTIFICATIONS AND CALL p2p_unmap()
+ * WHEN A NOTIFIER IS CALLED FOR THE RANGE! THIS CAN HAPPEN AT ANY
+ * POINT IN TIME WITH NO LOCK HELD.
+ *
+ * In the functions below, the device argument is the importing device;
+ * the exporting device is the device to which the vma belongs.
+ */
+ long (*p2p_map)(struct vm_area_struct *vma,
+ struct device *device,
+ unsigned long start,
+ unsigned long end,
+ dma_addr_t *pa,
+ bool write);
+ long (*p2p_unmap)(struct vm_area_struct *vma,
+ struct device *device,
+ unsigned long start,
+ unsigned long end,
+ dma_addr_t *pa);
+
/* notification that a previously read-only page is about to become
* writable, if an error is returned it will cause a SIGBUS */
vm_fault_t (*page_mkwrite)(struct vm_fault *vmf);
--
2.17.2