Re: [PATCH] dma-buf: heaps: Add Coherent heap to dmabuf heaps
From: Maxime Ripard
Date: Thu Feb 26 2026 - 05:14:28 EST
Hi Albert,
Thanks for your patch!
On Tue, Feb 24, 2026 at 08:57:33AM +0100, Albert Esteve wrote:
> Add a dma-buf heap for DT coherent reserved-memory
> (i.e., 'shared-dma-pool' without 'reusable' property),
> exposing one heap per region for userspace buffers.
>
> The heap binds a synthetic platform device to each region
> so coherent allocations use the correct dev->dma_mem,
> and it defers registration until late_initcall when
> normal allocator are available.
>
> This patch includes charging of the coherent heap
> allocator to the dmem cgroup.
>
> Signed-off-by: Albert Esteve <aesteve@xxxxxxxxxx>
> ---
> This patch introduces a new driver to expose DT coherent reserved-memory
> regions as dma-buf heaps, allowing userspace buffers to be created.
>
> Since these regions are device-dependent, we bind a synthetic platform
> device to each region so coherent allocations use the correct dev->dma_mem.
>
> Following Eric’s [1] and Maxime’s [2] work on charging DMA buffers
> allocated from userspace to cgroups (dmem), this patch adds the same
> charging pattern used by CMA heaps patch. Charging is done only through
> the dma-buf heap interface so it can be attributed to a userspace allocator.
>
> This allows each device-specific reserved-memory region to enforce its
> own limits.
>
> [1] https://lore.kernel.org/all/20260218-dmabuf-heap-cma-dmem-v2-0-b249886fb7b2@xxxxxxxxxx/
> [2] https://lore.kernel.org/all/20250310-dmem-cgroups-v1-0-2984c1bc9312@xxxxxxxxxx/
> ---
> drivers/dma-buf/heaps/Kconfig | 17 ++
> drivers/dma-buf/heaps/Makefile | 1 +
> drivers/dma-buf/heaps/coherent_heap.c | 485 ++++++++++++++++++++++++++++++++++
> include/linux/dma-heap.h | 11 +
> kernel/dma/coherent.c | 9 +
> 5 files changed, 523 insertions(+)
>
> diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
> index a5eef06c42264..93765dca164e3 100644
> --- a/drivers/dma-buf/heaps/Kconfig
> +++ b/drivers/dma-buf/heaps/Kconfig
> @@ -12,3 +12,20 @@ config DMABUF_HEAPS_CMA
> Choose this option to enable dma-buf CMA heap. This heap is backed
> by the Contiguous Memory Allocator (CMA). If your system has these
> regions, you should say Y here.
> +
> +config DMABUF_HEAPS_COHERENT
> + bool "DMA-BUF Coherent Reserved-Memory Heap"
> + depends on DMABUF_HEAPS && OF_RESERVED_MEM && DMA_DECLARE_COHERENT
> + help
> + Choose this option to enable coherent reserved-memory dma-buf heaps.
> + This heap is backed by non-reusable DT "shared-dma-pool" regions.
> + If your system defines coherent reserved-memory regions, you should
> + say Y here.
> +
> +config COHERENT_AREAS_DEFERRED
> + int "Max deferred coherent reserved-memory regions"
> + depends on DMABUF_HEAPS_COHERENT
> + default 16
> + help
> + Maximum number of coherent reserved-memory regions that can be
> + deferred for later registration during early boot.
> diff --git a/drivers/dma-buf/heaps/Makefile b/drivers/dma-buf/heaps/Makefile
> index 974467791032f..96bda7a65f041 100644
> --- a/drivers/dma-buf/heaps/Makefile
> +++ b/drivers/dma-buf/heaps/Makefile
> @@ -1,3 +1,4 @@
> # SPDX-License-Identifier: GPL-2.0
> obj-$(CONFIG_DMABUF_HEAPS_SYSTEM) += system_heap.o
> obj-$(CONFIG_DMABUF_HEAPS_CMA) += cma_heap.o
> +obj-$(CONFIG_DMABUF_HEAPS_COHERENT) += coherent_heap.o
> diff --git a/drivers/dma-buf/heaps/coherent_heap.c b/drivers/dma-buf/heaps/coherent_heap.c
> new file mode 100644
> index 0000000000000..870b2b89aefcb
> --- /dev/null
> +++ b/drivers/dma-buf/heaps/coherent_heap.c
> @@ -0,0 +1,485 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * DMABUF heap for coherent reserved-memory regions
> + *
> + * Copyright (C) 2026 Red Hat, Inc.
> + * Author: Albert Esteve <aesteve@xxxxxxxxxx>
> + *
> + */
> +
> +#include <linux/cgroup_dmem.h>
> +#include <linux/dma-heap.h>
> +#include <linux/dma-buf.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/err.h>
> +#include <linux/highmem.h>
> +#include <linux/iosys-map.h>
> +#include <linux/of_reserved_mem.h>
> +#include <linux/platform_device.h>
> +#include <linux/scatterlist.h>
> +#include <linux/slab.h>
> +#include <linux/vmalloc.h>
> +
> +#define DEFERRED_AREAS_MAX CONFIG_COHERENT_AREAS_DEFERRED
I'm not sure we need to make it configurable. Distros are going to set
it to the user with the highest number of regions anyway. How about
using MAX_RESERVED_REGIONS for now?
>
> [...]
>
> +struct coherent_heap {
> + struct dma_heap *heap;
> + struct reserved_mem *rmem;
> + char *name;
> + struct device *dev;
> + struct platform_device *pdev;
> +#if IS_ENABLED(CONFIG_CGROUP_DMEM)
> + struct dmem_cgroup_region *cg;
> +#endif
We might want to leave the dmem accounting out for now so we can focus
on the heap itself. That being said, it ended up being pretty trivial
for CMA, so maybe it's not too much of a concern.
>
> [...]
>
> +static int __coherent_heap_register(struct reserved_mem *rmem)
> +{
> + struct dma_heap_export_info exp_info;
> + struct coherent_heap *coh_heap;
> +#if IS_ENABLED(CONFIG_CGROUP_DMEM)
> + struct dmem_cgroup_region *region;
> +#endif
> + const char *rmem_name;
> + int ret;
> +
> + if (!rmem)
> + return -EINVAL;
> +
> + rmem_name = rmem->name ? rmem->name : "unknown";
If the reserved region has no name, we probably shouldn't expose it to
userspace at all. Using unknown will probably create some bugs if you
have several, but also it's pretty like to have a name at some point and
thus we wouldn't have a stable name for the uAPI.
> + coh_heap = kzalloc_obj(*coh_heap);
> + if (!coh_heap)
> + return -ENOMEM;
> +
> + coh_heap->name = kasprintf(GFP_KERNEL, "coherent_%s", rmem_name);
> + if (!coh_heap->name) {
> + ret = -ENOMEM;
> + goto free_coherent_heap;
> + }
Similarly, we shouldn't use the coherent_ prefix for the heap name. If
the backing allocator ever changes (and between contiguous and coherent,
the difference is just a single property value in the DT), then the name
would change and userspace would break. We should directly use the name
of the region here.
> + coh_heap->rmem = rmem;
> +
> + /* create a platform device per rmem and bind it */
> + coh_heap->pdev = platform_device_register_simple("coherent-heap",
> + PLATFORM_DEVID_AUTO,
> + NULL, 0);
> + if (IS_ERR(coh_heap->pdev)) {
> + ret = PTR_ERR(coh_heap->pdev);
> + goto free_name;
> + }
We probably should use a faux_device here instead of a platform_device,
but more importantly, the heap itself already has a device allocated for
it (dev_ret in dma_heap_add).
It's not in struct dma_heap yet, but there's a patch that moves it there
that we should probably carry:
https://lore.kernel.org/r/20210120210937.15069-2-john.stultz@xxxxxxxxxx/
> + if (rmem->ops && rmem->ops->device_init) {
> + ret = rmem->ops->device_init(rmem, &coh_heap->pdev->dev);
> + if (ret)
> + goto pdev_unregister;
> + }
I'm not really a fan of calling ops directly. Maybe we should create an
of_reserved_mem_device_init_with_mem function that would do it for us
(and would be called by of_reserved_mem_device_init_by_idx and the
likes).
> + coh_heap->dev = &coh_heap->pdev->dev;
> +#if IS_ENABLED(CONFIG_CGROUP_DMEM)
> + region = dmem_cgroup_register_region(rmem->size, "coh/%s", rmem_name);
> + if (IS_ERR(region)) {
> + ret = PTR_ERR(region);
> + goto pdev_unregister;
> + }
> + coh_heap->cg = region;
> +#endif
Same comment than for CMA here: it should really be created by the
coherent memory region itself.
> + exp_info.name = coh_heap->name;
> + exp_info.ops = &coherent_heap_ops;
> + exp_info.priv = coh_heap;
> +
> + coh_heap->heap = dma_heap_add(&exp_info);
> + if (IS_ERR(coh_heap->heap)) {
> + ret = PTR_ERR(coh_heap->heap);
> + goto cg_unregister;
> + }
> +
> + return 0;
> +
> +cg_unregister:
> +#if IS_ENABLED(CONFIG_CGROUP_DMEM)
> + dmem_cgroup_unregister_region(coh_heap->cg);
> +#endif
> +pdev_unregister:
> + platform_device_unregister(coh_heap->pdev);
> + coh_heap->pdev = NULL;
> +free_name:
> + kfree(coh_heap->name);
> +free_coherent_heap:
> + kfree(coh_heap);
> +
> + return ret;
> +}
> +
> +int dma_heap_coherent_register(struct reserved_mem *rmem)
> +{
> + int ret;
> +
> + ret = __coherent_heap_register(rmem);
> + if (ret == -ENOMEM)
> + return coherent_heap_add_deferred(rmem);
> + return ret;
> +}
I appreciate you did it like we did for CMA, but if we ever want to make
that heap a module you'll end up with a dependency from the core kernel
on a module which doesn't work. The best here might be to wait a bit
until we have somewhat of an agreement on
https://lore.kernel.org/r/20260225-dma-buf-heaps-as-modules-v1-0-2109225a090d@xxxxxxxxxx
> +static int __init coherent_heap_register_deferred(void)
> +{
> + unsigned int i;
> + int ret;
> +
> + for (i = 0; i < rmem_areas_deferred_num; i++) {
> + struct reserved_mem *rmem = rmem_areas_deferred[i];
> +
> + ret = __coherent_heap_register(rmem);
> + if (ret) {
> + pr_warn("Failed to add coherent heap %s",
> + rmem->name ? rmem->name : "unknown");
> + continue;
> + }
> + }
> +
> + return 0;
> +}
> +late_initcall(coherent_heap_register_deferred);
Why do you need a late_initcall here? Isn't module_init enough?
> +MODULE_DESCRIPTION("DMA-BUF heap for coherent reserved-memory regions");
> diff --git a/include/linux/dma-heap.h b/include/linux/dma-heap.h
> index 648328a64b27e..e894cfa1ecf1a 100644
> --- a/include/linux/dma-heap.h
> +++ b/include/linux/dma-heap.h
> @@ -9,9 +9,11 @@
> #ifndef _DMA_HEAPS_H
> #define _DMA_HEAPS_H
>
> +#include <linux/errno.h>
> #include <linux/types.h>
>
> struct dma_heap;
> +struct reserved_mem;
>
> /**
> * struct dma_heap_ops - ops to operate on a given heap
> @@ -48,4 +50,13 @@ struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info);
>
> extern bool mem_accounting;
>
> +#if IS_ENABLED(CONFIG_DMABUF_HEAPS_COHERENT)
> +int dma_heap_coherent_register(struct reserved_mem *rmem);
> +#else
> +static inline int dma_heap_coherent_register(struct reserved_mem *rmem)
> +{
> + return -EOPNOTSUPP;
> +}
> +#endif
> +
> #endif /* _DMA_HEAPS_H */
> diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
> index 1147497bc512c..f49d13e460e4b 100644
> --- a/kernel/dma/coherent.c
> +++ b/kernel/dma/coherent.c
> @@ -9,6 +9,7 @@
> #include <linux/module.h>
> #include <linux/dma-direct.h>
> #include <linux/dma-map-ops.h>
> +#include <linux/dma-heap.h>
>
> struct dma_coherent_mem {
> void *virt_base;
> @@ -393,6 +394,14 @@ static int __init rmem_dma_setup(struct reserved_mem *rmem)
> rmem->ops = &rmem_dma_ops;
> pr_info("Reserved memory: created DMA memory pool at %pa, size %ld MiB\n",
> &rmem->base, (unsigned long)rmem->size / SZ_1M);
> +
> + if (IS_ENABLED(CONFIG_DMABUF_HEAPS_COHERENT)) {
> + int ret = dma_heap_coherent_register(rmem);
> +
> + if (ret)
> + pr_warn("Reserved memory: failed to register coherent heap for %s (%d)\n",
> + rmem->name ? rmem->name : "unknown", ret);
> + }
I think this should be split into a patch of its own. It's going to be
reviewed (and possibly merged) by another maintainer, through another
tree.
Maxime
Attachment:
signature.asc
Description: PGP signature