Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

From: Jason Gunthorpe
Date: Tue Jun 22 2021 - 08:01:51 EST


On Tue, Jun 22, 2021 at 11:42:27AM +0300, Oded Gabbay wrote:
> On Tue, Jun 22, 2021 at 9:37 AM Christian König
> <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
> >
> > Am 22.06.21 um 01:29 schrieb Jason Gunthorpe:
> > > On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:
> > >
> > >> Another thing I want to emphasize is that we are doing p2p only
> > >> through the export/import of the FD. We do *not* allow the user to
> > >> mmap the dma-buf as we do not support direct IO. So there is no access
> > >> to these pages through the userspace.
> > > Arguably mmaping the memory is a better choice, and is the direction
> > > that Logan's series goes in. Here the use of DMABUF was specifically
> > > designed to allow hitless revokation of the memory, which this isn't
> > > even using.
> >
> > The major problem with this approach is that DMA-buf is also used for
> > memory which isn't CPU accessible.

That isn't an issue here because the memory is only intended to be
used with P2P transfers so it must be CPU accessible.

> > That was one of the reasons we didn't even considered using the mapping
> > memory approach for GPUs.

Well, now we have DEVICE_PRIVATE memory that can meet this need
too.. Just nobody has wired it up to hmm_range_fault()

> > > So you are taking the hit of very limited hardware support and reduced
> > > performance just to squeeze into DMABUF..
>
> Thanks Jason for the clarification, but I honestly prefer to use
> DMA-BUF at the moment.
> It gives us just what we need (even more than what we need as you
> pointed out), it is *already* integrated and tested in the RDMA
> subsystem, and I'm feeling comfortable using it as I'm somewhat
> familiar with it from my AMD days.

You still have the issue that this patch is doing all of this P2P
stuff wrong - following the already NAK'd AMD approach.

> I'll go and read Logan's patch-set to see if that will work for us in
> the future. Please remember, as Daniel said, we don't have struct page
> backing our device memory, so if that is a requirement to connect to
> Logan's work, then I don't think we will want to do it at this point.

It is trivial to get the struct page for a PCI BAR.

Jason