Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

From: Jerome Glisse
Date: Thu Jan 31 2019 - 10:11:52 EST


On Thu, Jan 31, 2019 at 09:05:01AM +0100, Christoph Hellwig wrote:
> On Wed, Jan 30, 2019 at 08:44:20PM +0000, Jason Gunthorpe wrote:
> > Not really, for MRs most drivers care about DMA addresses only. The
> > only reason struct page ever gets involved is because it is part of
> > the GUP, SGL and dma_map family of APIs.
>
> And the only way you get the DMA address is through the dma mapping
> APIs. Which except for the little oddball dma_map_resource expect
> a struct page in some form. And dma_map_resource isn't really up
> to speed for full blown P2P.
>
> Now we could and maybe eventually should change all this. But that
> is a pre-requisitive for doing anything more fancy, and not something
> to be hacked around.
>
> > O_DIRECT seems to be the justification for struct page, but nobody is
> > signing up to make O_DIRECT have the required special GUP/SGL/P2P flow
> > that would be needed to *actually* make that work - so it really isn't
> > a justification today.
>
> O_DIRECT is just the messenger. Anything using GUP will need a struct
> page, which is all our interfaces that do I/O directly to user pages.

I do not want to allow GUP to pin I/O space this would open a pandora
box that we do not want to open at all. Many driver manage their IO
space and if they get random pinning because some other kernel bits
they never heard of starts to do GUP on their stuff it is gonna cause
havoc.

So far mmap of device file have always been special and it has been
reflected to userspace in all the instance i know of (media and GPU).
Pretending we can handle them like any other vma is a lie because
they were never designed that way in the first place and it would be
disruptive to all those driver.

Minimum disruption with minimun changes is what we should aim for and
is what i am trying to do with this patchset. Using struct page and
allowing GUP would mean rewritting huge chunk of GPU drivers (pretty
much rewritting their whole memory management) with no benefit at the
end.

When something is special it is better to leave it that way.

Cheers,
Jérôme