Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

From: Dan Williams
Date: Sat Apr 15 2017 - 18:09:45 EST


On Sat, Apr 15, 2017 at 10:41 AM, Logan Gunthorpe <logang@xxxxxxxxxxxx> wrote:
> Thanks, Benjamin, for the summary of some of the issues.
>
> On 14/04/17 04:07 PM, Benjamin Herrenschmidt wrote
>> So I assume the p2p code provides a way to address that too via special
>> dma_ops ? Or wrappers ?
>
> Not at this time. We will probably need a way to ensure the iommus do
> not attempt to remap these addresses. Though if it does, I'd expect
> everything would still work you just wouldn't get the performance or
> traffic flow you are looking for. We've been testing with the software
> iommu which doesn't have this problem.
>
>> The problem is that the latter while seemingly easier, is also slower
>> and not supported by all platforms and architectures (for example,
>> POWER currently won't allow it, or rather only allows a store-only
>> subset of it under special circumstances).
>
> Yes, I think situations where we have to cross host bridges will remain
> unsupported by this work for a long time. There are two many cases where
> it just doesn't work or it performs too poorly to be useful.
>
>> I don't fully understand how p2pmem "solves" that by creating struct
>> pages. The offset problem is one issue. But there's the iommu issue as
>> well, the driver cannot just use the normal dma_map ops.
>
> We are not using a proper iommu and we are dealing with systems that
> have zero offset. This case is also easily supported. I expect fixing
> the iommus to not map these addresses would also be reasonably achievable.

I'm wondering, since this is limited to support behind a single
switch, if you could have a software-iommu hanging off that switch
device object that knows how to catch and translate the non-zero
offset bus address case. We have something like this with VMD driver,
and I toyed with a soft pci bridge when trying to support AHCI+NVME
bar remapping. When the dma api looks up the iommu for its device it
hits this soft-iommu and that driver checks if the page is host memory
or device memory to do the dma translation. You wouldn't need a bit in
struct page, just a lookup to the hosting struct dev_pagemap in the
is_zone_device_page() case and that can point you to p2p details.