Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
From: Shunsuke Mie
Date: Tue Sep 14 2021 - 06:13:52 EST
On Tue, Sep 14, 2021 at 18:38 Daniel Vetter <daniel.vetter@xxxxxxxx> wrote:
>
> On Tue, Sep 14, 2021 at 9:11 AM Shunsuke Mie <mie@xxxxxxxxxx> wrote:
> >
> > On Tue, Sep 14, 2021 at 4:23 Daniel Vetter <daniel.vetter@xxxxxxxx> wrote:
> > >
> > > On Fri, Sep 10, 2021 at 3:46 AM Shunsuke Mie <mie@xxxxxxxxxx> wrote:
> > > >
> > > > On Thu, Sep 9, 2021 at 18:26 Daniel Vetter <daniel.vetter@xxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Sep 9, 2021 at 1:33 AM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> > > > > > On Wed, Sep 08, 2021 at 09:22:37PM +0200, Daniel Vetter wrote:
> > > > > > > On Wed, Sep 8, 2021 at 3:33 PM Christian König <christian.koenig@xxxxxxx> wrote:
> > > > > > > > > On 08.09.21 at 13:18, Jason Gunthorpe wrote:
> > > > > > > > > On Wed, Sep 08, 2021 at 05:41:39PM +0900, Shunsuke Mie wrote:
> > > > > > > > > >> On Wed, Sep 8, 2021 at 16:20 Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> > > > > > > > >>> On Wed, Sep 08, 2021 at 04:01:14PM +0900, Shunsuke Mie wrote:
> > > > > > > > >>>> Thank you for your comment.
> > > > > > > > >>>>> On Wed, Sep 08, 2021 at 03:16:09PM +0900, Shunsuke Mie wrote:
> > > > > > > > > >>>>>> To share memory space using dma-buf, an API of the dma-buf requires a dma
> > > > > > > > > >>>>>> device, but devices such as rxe do not have a dma device. For those cases,
> > > > > > > > > >>>>>> change to specify a device of struct ib instead of the dma device.
> > > > > > > > >>>>> So if dma-buf doesn't actually need a device to dma map why do we ever
> > > > > > > > >>>>> pass the dma_device here? Something does not add up.
> > > > > > > > >>>> As described in the dma-buf api guide [1], the dma_device is used by dma-buf
> > > > > > > > >>>> exporter to know the device buffer constraints of importer.
> > > > > > > > > >>>> [1] https://lwn.net/Articles/489703/
> > > > > > > > >>> Which means for rxe you'd also have to pass the one for the underlying
> > > > > > > > >>> net device.
> > > > > > > > >> I thought of that way too. In that case, the memory region is constrained by the
> > > > > > > > >> net device, but rxe driver copies data using CPU. To avoid the constraints, I
> > > > > > > > >> decided to use the ib device.
> > > > > > > > > Well, that is the whole problem.
> > > > > > > > >
> > > > > > > > > We can't mix the dmabuf stuff people are doing that doesn't fill in
> > > > > > > > > the CPU pages in the SGL with RXE - it is simply impossible as things
> > > > > > > > > > currently are for RXE to access this non-struct page memory.
> > > > > > > >
> > > > > > > > Yeah, agree that doesn't make much sense.
> > > > > > > >
> > > > > > > > When you want to access the data with the CPU then why do you want to
> > > > > > > > use DMA-buf in the first place?
> > > > > > > >
> > > > > > > > Please keep in mind that there is work ongoing to replace the sg table
> > > > > > > > with a DMA address array and so make the underlying struct page
> > > > > > > > inaccessible for importers.
> > > > > > >
> > > > > > > Also if you do have a dma-buf, you can just dma_buf_vmap() the buffer
> > > > > > > for cpu access. Which intentionally does not require any device. No
> > > > > > > idea why there's a dma_buf_attach involved. Now not all exporters
> > > > > > > support this, but that's fixable, and you must call
> > > > > > > dma_buf_begin/end_cpu_access for cache management if the allocation
> > > > > > > isn't cpu coherent. But it's all there, no need to apply hacks of
> > > > > > > allowing a wrong device or other fun things.
> > > > > >
> > > > > > Can rxe leave the vmap in place potentially forever?
> > > > >
> > > > > Yeah, it's like perma-pinning the buffer into system memory for
> > > > > non-p2p dma-buf sharing. We just squint and pretend that can't be
> > > > > abused too badly :-) On 32bit you'll run out of vmap space rather
> > > > > quickly, but that's not something anyone cares about here either. We
> > > > > have a bunch of more sw modesetting drivers in drm which use
> > > > > dma_buf_vmap() like this, so it's all fine.
> > > > > -Daniel
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > > >
> > > > Thanks for your comments.
> > > >
> > > > The underlying problem is that a CMA region cannot be used for RDMA because
> > > > the region has no struct page. Some GPU drivers use CMA and share the region
> > > > as a dma-buf, so RDMA cannot transfer data to or from that region. I thought
> > > > adding dma-buf support to rxe would be the better way to solve this.
> > > >
> > > > I'll rethink and redesign the rxe dma-buf support to use dma_buf_vmap()
> > > > instead of dma_buf_dynamic_attach().
> > >
> > > btw for next version please cc dri-devel. get_maintainer.pl should
> > > pick it up for these patches.
> > The CC list for these patches was generated by get_maintainer.pl, but it
> > didn't pick up dri-devel. Should I add dri-devel to the cc
> > manually?
>
> Hm yes, on rechecking the regex doesn't match since you're not
> touching any dma-buf code directly. Or not directly enough for
> get_maintainer.pl to pick it up.
>
> DMA BUFFER SHARING FRAMEWORK
> M: Sumit Semwal <sumit.semwal@xxxxxxxxxx>
> M: Christian König <christian.koenig@xxxxxxx>
> L: linux-media@xxxxxxxxxxxxxxx
> L: dri-devel@xxxxxxxxxxxxxxxxxxxxx
> L: linaro-mm-sig@xxxxxxxxxxxxxxxx (moderated for non-subscribers)
> S: Maintained
> T: git git://anongit.freedesktop.org/drm/drm-misc
> F: Documentation/driver-api/dma-buf.rst
> F: drivers/dma-buf/
> F: include/linux/*fence.h
> F: include/linux/dma-buf*
> F: include/linux/dma-resv.h
> K: \bdma_(?:buf|fence|resv)\b
>
> Above is the MAINTAINERS entry that's always good to cc for anything
> related to dma_buf/fence/resv and any of these related things.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
Yes, my changes did not touch dma-buf code directly, but they are related
to dma-buf. So I'll add the dma-buf related mailing lists and maintainers
to cc using
`./scripts/get_maintainer.pl -f drivers/infiniband/core/umem_dmabuf.c`.
I think listing those email addresses is enough.
Thank you for letting me know.
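
For reference, here is a minimal sketch of how the K: pattern quoted above
behaves on a few kinds of text. It assumes Python's `re` module treats `\b`
and `(?:...)` the same way Perl does for this pattern. The sample lines and
the `k_matches` helper are hypothetical, made up for illustration, and are not
taken from the actual patch or from get_maintainer.pl itself.

```python
import re

# K: pattern copied from the MAINTAINERS entry quoted above.
K_PATTERN = re.compile(r"\bdma_(?:buf|fence|resv)\b")

# Hypothetical lines, made up for illustration (not from the real patch).
samples = [
    "struct dma_buf *dmabuf;",
    "dma_buf_dynamic_attach(dmabuf, dev, &ops, umem);",
    "drivers/infiniband/core/umem_dmabuf.c",
    "rxe dma-buf support",
]


def k_matches(line: str) -> bool:
    """Return True if the K: pattern matches anywhere in the line."""
    return K_PATTERN.search(line) is not None


for line in samples:
    print(f"{k_matches(line)!s:5}  {line}")
```

In this sketch only the first sample matches. The trailing `\b` rejects
identifiers such as `dma_buf_dynamic_attach`, because `_` counts as a word
character, and neither the file name nor the hyphenated "dma-buf" contains
`dma_buf` as a whole word. That would be consistent with the regex not firing
on a patch that only calls dma-buf helpers.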
Regards,
Shunsuke