Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

From: Jason Gunthorpe
Date: Tue Jun 22 2021 - 08:15:53 EST


On Tue, Jun 22, 2021 at 03:04:30PM +0300, Oded Gabbay wrote:
> On Tue, Jun 22, 2021 at 3:01 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> >
> > On Tue, Jun 22, 2021 at 11:42:27AM +0300, Oded Gabbay wrote:
> > > On Tue, Jun 22, 2021 at 9:37 AM Christian König
> > > <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
> > > >
> > > > Am 22.06.21 um 01:29 schrieb Jason Gunthorpe:
> > > > > On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:
> > > > >
> > > > >> Another thing I want to emphasize is that we are doing p2p only
> > > > >> through the export/import of the FD. We do *not* allow the user to
> > > > >> mmap the dma-buf as we do not support direct IO. So there is no access
> > > > >> to these pages through the userspace.
> > > > > Arguably mmaping the memory is a better choice, and is the direction
> > > > > that Logan's series goes in. Here the use of DMABUF was specifically
> > > > > designed to allow hitless revokation of the memory, which this isn't
> > > > > even using.
> > > >
> > > > The major problem with this approach is that DMA-buf is also used for
> > > > memory which isn't CPU accessible.
> >
> > That isn't an issue here because the memory is only intended to be
> > used with P2P transfers so it must be CPU accessible.
> >
> > > > That was one of the reasons we didn't even considered using the mapping
> > > > memory approach for GPUs.
> >
> > Well, now we have DEVICE_PRIVATE memory that can meet this need
> > too.. Just nobody has wired it up to hmm_range_fault()
> >
> > > > > So you are taking the hit of very limited hardware support and reduced
> > > > > performance just to squeeze into DMABUF..
> > >
> > > Thanks Jason for the clarification, but I honestly prefer to use
> > > DMA-BUF at the moment.
> > > It gives us just what we need (even more than what we need as you
> > > pointed out), it is *already* integrated and tested in the RDMA
> > > subsystem, and I'm feeling comfortable using it as I'm somewhat
> > > familiar with it from my AMD days.
> >
> > You still have the issue that this patch is doing all of this P2P
> > stuff wrong - following the already NAK'd AMD approach.
>
> Could you please point me exactly to the lines of code that are wrong
> in your opinion ?

1) Setting sg_page to NULL
2) 'mapping' pages for P2P DMA without going through the iommu
3) Allowing P2P DMA without using the p2p dma API to validate that it
can work at all in the first place.

All of these result in functional bugs in certain system
configurations.

Jason