Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

From: Benjamin Herrenschmidt
Date: Tue Apr 18 2017 - 21:27:12 EST

Next message: kbuild test robot: "[rcu:rcu/next 29/29] kernel/rcu/rcutorture.c:1369:3: error: implicit declaration of function 'srcutorture_get_gp_data'"
Previous message: Zheng, Lv: "RE: [PATCH] ACPICA: Export mutex functions"
In reply to: Jason Gunthorpe: "Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory"
Next in thread: Logan Gunthorpe: "Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, 2017-04-18 at 17:21 -0600, Jason Gunthorpe wrote:
> Splitting the sgl is different from iommu batching.
>
> As an example, an O_DIRECT write of 1 MB with a single 4K P2P page in
> the middle.
>
> The optimum behavior is to allocate a 1MB-4K iommu range and fill it
> with the CPU memory. Then return an SGL with three entries, two
> pointing into the range and one to the p2p.
>
> It is creating each range which tends to be expensive, so creating
> two ranges (or worse, if every SGL created a range it would be 255)
> is very undesired.
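
For reference, the layout Jason describes can be sketched in plain
userspace C. This is only an illustration: struct sg_entry and
layout_single_range() are hypothetical names, not kernel APIs, and the
addresses are made up. It assumes the P2P span lies entirely inside the
transfer (p2p_off + p2p_len <= total).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a scatterlist entry. */
struct sg_entry {
    uint64_t dma_addr; /* address the device would use */
    size_t   len;      /* length in bytes */
    int      is_p2p;   /* 1 if the entry points at peer device memory */
};

/*
 * Lay out a transfer of `total` bytes in which the bytes
 * [p2p_off, p2p_off + p2p_len) live in peer (P2P) memory.
 *
 * All CPU memory is packed into ONE contiguous IOVA range starting at
 * iova_base, so only a single range has to be created. The size of that
 * range is stored in *range_len.
 *
 * Assumption: p2p_off + p2p_len <= total.
 * Returns the number of entries written to out[] (at most 3).
 */
static size_t layout_single_range(size_t total, size_t p2p_off, size_t p2p_len,
                                  uint64_t iova_base, uint64_t p2p_bus_addr,
                                  struct sg_entry out[3], size_t *range_len)
{
    size_t n = 0;
    size_t tail = total - (p2p_off + p2p_len);

    /* One range covers everything except the P2P span. */
    *range_len = total - p2p_len;

    if (p2p_off > 0)   /* CPU memory before the P2P page */
        out[n++] = (struct sg_entry){ iova_base, p2p_off, 0 };
    if (p2p_len > 0)   /* the P2P page itself, not routed through the range */
        out[n++] = (struct sg_entry){ p2p_bus_addr, p2p_len, 1 };
    if (tail > 0)      /* CPU memory after it, continuing in the same range */
        out[n++] = (struct sg_entry){ iova_base + p2p_off, tail, 0 };

    return n;
}
```

With total = 1 MB, p2p_off = 512 KB and p2p_len = 4 KB this gives one
range of 1 MB - 4 KB and three entries: two pointing into the range and
one pointing at the P2P page.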

I think the easiest way to get started is to just use a helper and
stick it in the architecture's existing sglist processing loop.

As we noticed, stacking dma_ops is actually non-trivial and opens quite
the can of worms.

As Jerome mentioned, you can end up with IO ops containing an sglist
that is a mix of system memory and GPU pages, for example.
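
To make the helper idea concrete, here is a rough userspace sketch of a
per-entry helper called from inside a mapping loop that walks a mixed
sglist. Everything here is hypothetical: struct mixed_entry,
p2p_try_map() and map_sg_sketch() are illustration-only names, and
translating a peer address by adding a fixed bus_offset is a
simplifying assumption, not how any real architecture does it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified sglist entry. */
struct mixed_entry {
    uint64_t phys;   /* CPU physical address of the backing memory */
    size_t   len;    /* length in bytes */
    bool     is_p2p; /* true for peer device (e.g. GPU) memory */
    uint64_t dma;    /* filled in by the mapping loop */
};

/*
 * Hypothetical helper: if the entry is peer memory, compute its bus
 * address directly and report that it has been handled.
 * Assumption: bus address = phys + bus_offset.
 */
static bool p2p_try_map(struct mixed_entry *e, uint64_t bus_offset)
{
    if (!e->is_p2p)
        return false;
    e->dma = e->phys + bus_offset;
    return true;
}

/*
 * Stand-in for an architecture's existing sglist loop. Regular entries
 * are packed one after another into a single IOVA range; P2P entries
 * are diverted to the helper and skip the IOMMU path.
 * Returns the number of entries that went through the IOMMU path.
 */
static size_t map_sg_sketch(struct mixed_entry *sgl, size_t nents,
                            uint64_t iova_base, uint64_t bus_offset)
{
    uint64_t next_iova = iova_base;
    size_t via_iommu = 0;

    for (size_t i = 0; i < nents; i++) {
        if (p2p_try_map(&sgl[i], bus_offset))
            continue;               /* handled by the helper */

        sgl[i].dma = next_iova;     /* normal system-memory path */
        next_iova += sgl[i].len;
        via_iommu++;
    }
    return via_iommu;
}
```

The point of this shape is that the existing loop stays in charge of
the IOMMU range, and the only change is one early check per entry, with
no second layer of dma_ops stacked on top.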

Cheers,
Ben.
