Re: [PATCH v9 5/6] drm/panthor: Support sparse mappings

From: Boris Brezillon

Date: Fri Apr 24 2026 - 06:36:54 EST


On Fri, 24 Apr 2026 11:09:27 +0100
Steven Price <steven.price@xxxxxxx> wrote:

> Hi Adrián,
>
> On 22/04/2026 13:25, Adrián Larumbe wrote:
> > Allow UM to bind sparsely populated memory regions by cyclically mapping
> > virtual ranges over a kernel-allocated dummy BO. This alternative is
> > preferable to the old method of handling sparseness in the UMD, because that
> > method relied on the creation of a buffer object to the same end, despite the fact
> > Vulkan sparse resources don't need to be backed by a driver BO.
> >
> > The choice of backing sparsely-bound regions with a Panthor BO was made so
> > as to profit from the existing shrinker reclaim code. That way no special
> > treatment must be given to the dummy sparse BOs when reclaiming memory, as
> > would be the case if we had chosen a raw kernel page implementation.
> >
> > A new dummy BO is allocated per open file context, because even though the
> > Vulkan spec mandates that writes into sparsely bound regions must be
> > discarded, our implementation is still a workaround over the fact Mali CSF
> > GPUs cannot support this behaviour on the hardware level, so writes still
> > make it into the backing BO. If we had a global one, then it could be an
> > avenue for information leaks between file contexts, which should never
> > happen in DRM.
> >
> > Reviewed-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>
> > Signed-off-by: Adrián Larumbe <adrian.larumbe@xxxxxxxxxxxxx>
>
> Looks good, a few issues below.
>
> I'm worried about remap_evicted_vma() and how that interacts with sparse
> mappings. Does that need to be fixed up to handle sparse mappings? Or is
> there something to prevent the dummy BO being reclaimed? I might be
> missing something here.

Given the sparse mappings still have a vm_bo+gem object attached to them,
I think reclaim is fine, but I'll double check.

> > +static int
> > +panthor_vm_map_sparse(struct panthor_vm *vm, u64 iova, int prot,
> > +		      struct sg_table *sgt, u64 size)
> > +{
> > +	u64 start_iova = iova;
> > +	int ret;
> > +
> > +	if (iova & (SZ_2M - 1)) {
> > +		u64 unaligned_size = min(ALIGN(iova, SZ_2M) - iova, size);
> > +
> > +		ret = panthor_vm_map_pages(vm, iova, prot, sgt,
> > +					   0, unaligned_size);
> > +		if (ret)
> > +			return ret;
> > +
> > +		size -= unaligned_size;
> > +		iova += unaligned_size;
> > +	}
> > +
> > +	/* TODO: we should probably optimize this at the io_pgtable level. */
> > +	while (size > 0) {
> > +		u64 next_size = min(size, sg_dma_len(sgt->sgl));
>
> Here we're only using the first entry of the scatter list. So I think in
> the fragmented case we don't end up using the full 2MB.

It should just be

u32 chunk_size = min(size, SZ_2M);

really. Whether or not the BO is backed by physically contiguous memory
doesn't matter, because panthor_vm_map_pages() can cope with that
already.

>
> > +
> > +		ret = panthor_vm_map_pages(vm, iova, prot,
> > +					   sgt, 0, next_size);
> > +		if (ret)
> > +			goto err_unmap;
> > +
> > +		size -= next_size;
> > +		iova += next_size;
> > +	}

To sum up, the whole thing can be simplified to something like:

static int
panthor_vm_map_sparse(struct panthor_vm *vm, u64 iova, int prot,
		      struct sg_table *sgt, u64 size)
{
	u64 offset = 0;
	int ret;

	while (offset < size) {
		/* Stop each chunk at the next 2M boundary of the current address. */
		u32 chunk_size = min(size - offset,
				     SZ_2M - ((iova + offset) & (SZ_2M - 1)));

		ret = panthor_vm_map_pages(vm, iova + offset, prot,
					   sgt, 0, chunk_size);
		if (ret) {
			/* Undo everything mapped so far. */
			panthor_vm_unmap_pages(vm, iova, offset);
			return ret;
		}

		offset += chunk_size;
	}

	return 0;
}
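
For anyone who wants to sanity-check the chunking arithmetic outside the
kernel, here is a rough userspace sketch of the same loop. It only computes
the (iova, size) pairs that would be handed to panthor_vm_map_pages() and
maps nothing. The names split_sparse_range(), struct chunk and CHUNK_ALIGN
are made up for this illustration and are not part of the driver; the 2 MiB
value simply mirrors SZ_2M.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for SZ_2M: 2 MiB. */
#define CHUNK_ALIGN ((uint64_t)2 << 20)

/* Hypothetical record of one map call: start address and length. */
struct chunk {
	uint64_t iova;
	uint64_t size;
};

/*
 * Split [iova, iova + size) into pieces that never cross a 2 MiB
 * boundary. Returns the number of chunks written to out[], or
 * (size_t)-1 if out[] is too small to hold them all.
 */
static size_t split_sparse_range(uint64_t iova, uint64_t size,
				 struct chunk *out, size_t max_chunks)
{
	uint64_t offset = 0;
	size_t n = 0;

	assert(out != NULL || max_chunks == 0);

	while (offset < size) {
		uint64_t cur = iova + offset;
		/* Distance from the current address to the next boundary. */
		uint64_t to_boundary = CHUNK_ALIGN - (cur & (CHUNK_ALIGN - 1));
		uint64_t remaining = size - offset;
		uint64_t len = remaining < to_boundary ? remaining : to_boundary;

		if (n == max_chunks)
			return (size_t)-1;

		out[n].iova = cur;
		out[n].size = len;
		n++;
		offset += len;
	}

	return n;
}
```

With iova = 0x1ff000 and size = 0x402000 this yields a 4 KiB head chunk,
two full 2 MiB chunks starting at 0x200000 and 0x400000, and a 4 KiB tail
at 0x600000. The alignment has to be computed from the current address
(iova + offset), not from the initial iova, otherwise every chunk after the
head stays misaligned.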