Re: [PATCH] dma-buf: system_heap: Optimize sg_table-to-pages conversion in vmap

From: T.J. Mercier

Date: Tue May 05 2026 - 10:46:59 EST


On Mon, May 4, 2026 at 12:49 AM Christian König
<christian.koenig@xxxxxxx> wrote:
>
> On 5/1/26 17:54, T.J. Mercier wrote:
> > On Thu, Apr 30, 2026 at 9:15 PM Barry Song <baohua@xxxxxxxxxx> wrote:
> >>
> >> On Wed, Apr 22, 2026 at 3:10 PM Christian König
> >> <christian.koenig@xxxxxxx> wrote:
> >>>
> >>> On 4/7/26 13:29, Barry Song wrote:
> >>>> On Tue, Apr 7, 2026 at 3:58 PM Christian König <christian.koenig@xxxxxxx> wrote:
> >>>>>
> >>>>> On 4/6/26 23:49, Barry Song (Xiaomi) wrote:
> >>>>>> From: Xueyuan Chen <Xueyuan.chen21@xxxxxxxxx>
> >>>>>>
> >>>>>> Replace the heavy for_each_sgtable_page() iterator in system_heap_do_vmap()
> >>>>>> with a more efficient nested loop approach.
> >>>>>>
> >>>>>> Instead of iterating page by page, we now iterate through the scatterlist
> >>>>>> entries via for_each_sgtable_sg(). Because pages within a single sg entry
> >>>>>> are physically contiguous, we can populate the page array with a in an
> >>>>>> inner loop using simple pointer math. This save a lot of time.
> >>>>>>
> >>>>>> The WARN_ON check is also pulled out of the loop to save branch
> >>>>>> instructions.
> >>>>>>
> >>>>>> Performance results mapping a 2GB buffer on Radxa O6:
> >>>>>> - Before: ~1440000 ns
> >>>>>> - After: ~232000 ns
> >>>>>> (~84% reduction in iteration time, or ~6.2x faster)
> >>>>>
> >>>>> Well real question is why do you care about the vmap performance?
> >>>>>
> >>>>> That should basically only be used for fbdev emulation (except for VMGFX) and we absolutely don't care about performance there.
> >>>>
> >>>> I agree that in mainline, dma_buf_vmap is not used very often.
> >>>> Here’s what I was able to find:
> >>>>
> >>>> 1 1638 drivers/dma-buf/dma-buf.c <<dma_buf_vmap_unlocked>>
> >>>> ret = dma_buf_vmap(dmabuf, map);
> >>>> 2 376 drivers/gpu/drm/drm_gem_shmem_helper.c
> >>>> <<drm_gem_shmem_vmap_locked>>
> >>>> ret = dma_buf_vmap(obj->import_attach->dmabuf, map);
> >>>> 3 85 drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
> >>>> <<etnaviv_gem_prime_vmap_impl>>
> >>>> ret = dma_buf_vmap(etnaviv_obj->base.import_attach->dmabuf, &map);
> >>>> 4 433 drivers/gpu/drm/vmwgfx/vmwgfx_blit.c <<map_external>>
> >>>> ret = dma_buf_vmap(bo->tbo.base.dma_buf, map);
> >>>> 5 88 drivers/gpu/drm/vmwgfx/vmwgfx_gem.c <<vmw_gem_vmap>>
> >>>> ret = dma_buf_vmap(obj->import_attach->dmabuf, map);
> >>>>
> >>>> However, in the Android ecosystem, system_heap and similar heaps
> >>>> are widely used across camera, NPU, and media drivers. Many of these
> >>>> drivers are not in mainline but do use vmap() in real code paths.
> >>>
> >>> Well out of tree drivers are not a justification to make an upstream changes.
> >>>
> >>> Apart from a handful of workarounds which need to CPU access as fallback DMA-buf vmap is only used to provide fb dev emulation.
> >>>
> >>> The vmap interface has already given us quite a headache in the first place and there are a couple of unresolved problems regarding synchronization and coherency.
> >>>
> >>> When a driver would be pushed upstream which makes so frequent use of the dma_buf_vmap function that it matters for the performance I think there would be push back on that and the driver developer would require a very good explanation why that is necessary.
> >>>
> >>> So for now I have to reject that patch.
> >>
> >> Well, it doesn’t seem to increase complexity, and the code is quite easy
> >> to understand.
> >
> > I agree with this. This change introduces basically no downsides for
> > upstream, even if it primarily benefits a rare use case. Since
> > dma_buf_vmap is exported for driver use, why not enhance the
> > performance for all callers?
>
> Because we essentially want to restrict the vmap interface to only the fb dev emulation use case and not promote or even expand it.
>
> When this matters performance wise the caller is clearly doing something wrong and by improving the performance we just paper over the issue instead of fixing it.

Ack, I understand your position.

> Regards,
> Christian.
>
> >
> > -T.J.
> >
> >> It would be great if the community could be more welcoming
> >> to developers who are just getting involved, rather than discouraging them.
> >>
> >> Apparently, no one can control whether the source code of those kernel
> >> modules will be upstreamed except the vendors themselves, but products
> >> can still benefit from the common kernel.
> >>
> >> Best Regards
> >> Barry
>