Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

From: Oded Gabbay
Date: Wed Jun 23 2021 - 14:43:37 EST


On Wed, Jun 23, 2021 at 9:24 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> On Wed, Jun 23, 2021 at 10:57:35AM +0200, Christian König wrote:
>
> > > > No it isn't. It makes devices depend on allocating struct pages for their
> > > > BARs, which is neither necessary nor desired.
> > > Which dramatically reduces the cost of establishing DMA mappings, a
> > > loop of dma_map_resource() is very expensive.
> >
> > Yeah, but that is perfectly ok. Our BAR allocations are either in chunks of
> > at least 2MiB or only a single 4KiB page.
>
> And very small apparently
>
> > > > Allocating struct pages has its use case, for example for exposing VRAM
> > > > as memory for HMM. But that is something very specific and should not limit
> > > > PCIe P2P DMA in general.
> > > Sure, but that is an ideal we are far from obtaining, and nobody wants
> > > to work on it, preferring to do hacky hacky like this.
> > >
> > > If you believe in this then remove the scatter list from dmabuf, add a
> > > new set of dma_map* APIs to work on physical addresses and all the
> > > other stuff needed.
> >
> > Yeah, that's what I totally agree on. And I actually hoped that the new P2P
> > work for PCIe would go into that direction, but that didn't materialize.
>
> It is a lot of work and the only gain is to save a bit of memory for
> struct pages. Not a very big pay off.
>
> > But allocating struct pages for PCIe BARs which are essentially registers
> > and not memory is much more hacky than the dma_map_resource() approach.
>
> It doesn't really matter. The pages are in a special zone and are only
> being used as handles for the BAR memory.
>
> > By using PCIe P2P we want to avoid the round trip to the CPU when one device
> > has filled the ring buffer and another device must be woken up to process
> > it.
>
> Sure, we all have these scenarios, what is inside the memory doesn't
> really matter. The mechanism is generic and the struct pages don't care
> much if they point at something memory-like or at something
> register-like.
>
> They are already in big trouble because you can't portably use CPU
> instructions to access them anyhow.
>
> Jason

Jason,
Can you please explain why it is so important to allow accessing them
through the CPU?
In regard to p2p, where is the use case for that?
The whole purpose is that the other device accesses my device,
bypassing the CPU.

Thanks,
Oded