Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination
From: Danilo Krummrich
Date: Tue Sep 12 2023 - 06:09:27 EST
On Tue, Sep 12, 2023 at 09:42:44AM +0200, Thomas Hellström wrote:
> Hi, Danilo
>
> On 9/11/23 19:49, Danilo Krummrich wrote:
> > Hi Thomas,
> >
> > On 9/11/23 19:19, Thomas Hellström wrote:
> > > Hi, Danilo
> > >
> > > On 9/9/23 17:31, Danilo Krummrich wrote:
> > > > This patch adds an abstraction layer between the drm_gpuva mappings of
> > > > a particular drm_gem_object and this GEM object itself. The abstraction
> > > > represents a combination of a drm_gem_object and drm_gpuvm. The
> > > > drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
> > > > representing this abstraction), while each drm_gpuvm_bo contains
> > > > list of
> > > > mappings of this GEM object.
> > > >
> > > > This has multiple advantages:
> > > >
> > > > 1) We can use the drm_gpuvm_bo structure to attach it to various lists
> > > > of the drm_gpuvm. This is useful for tracking external and evicted
> > > > objects per VM, which is introduced in subsequent patches.
> > > >
> > > > 2) Finding mappings of a certain drm_gem_object mapped in a certain
> > > > drm_gpuvm becomes much cheaper.
> > > >
> > > > 3) Drivers can derive and extend the structure to easily represent
> > > > driver specific states of a BO for a certain GPUVM.
> > > >
> > > > The idea of this abstraction was taken from amdgpu, hence the
> > > > credit for
> > > > this idea goes to the developers of amdgpu.
> > > >
> > > > Cc: Christian König <christian.koenig@xxxxxxx>
> > > > Signed-off-by: Danilo Krummrich <dakr@xxxxxxxxxx>
> > >
> > > Did you consider having the drivers embed the struct drm_gpuvm_bo in
> > > their own bo definition? I figure that would mean using the gem bo's
> > > refcounting and providing a helper to call from the driver's bo
> > > release. Looks like that could potentially save a lot of code? Or is
> > > there something that won't work with that approach?
> >
> > There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free
> > callback for drivers to register for that purpose.
> >
> > - Danilo
>
> Now after looking a bit deeper, I think actually the question could be
> rephrased as, why don't we just use the
> struct drm_gem_object::gpuva struct as the drm_gpuvm_bo in the spirit of
> keeping things simple? Drivers would then just embed it in their bo subclass
> and we'd avoid unnecessary fields in the struct drm_gem_object for drivers
> that don't do VM_BIND yet.
struct drm_gem_object::gpuva is just a container containing a list in order to
(currently) attach drm_gpuva structs to it and with this patch attach
drm_gpuvm_bo structs (combination of BO + VM) to it. Doing the above basically
means "leave everything as it is, but move the list_head of drm_gpuvs per GEM to
the driver specific BO structure". Having a common connection between GEM
objects and drm_gpuva structs was one of the goals of the initial GPUVA manager
patch series however.
>
> Sure, this won't be per bo and per vm, but it'd really only make a slight
> difference where we have multiple VMAs per bo, where per-vm per-bo state
> either needs to be duplicated or attached to a single vma (as in the case of
> the external bo list).
Correct, one implication is that we don't get a per VM and BO abstraction, and
hence are left with a list of all drm_gpuva structs having the same backing BO,
regardless of the VM.
For amdgpu this was always a concern. Now that we want to keep track of external
and evicted objects it's going to be a concern for most drivers I guess. Because
the only structure we could use for tracking external and evicted objects we are
left with (without having a VM_BO abstraction) is struct drm_gpuva. But this
structure isn't unique and we need to consider cases where userspace just
allocates rather huge BOs and creates tons of mappings from it. Running the full
list of drm_gpuva structs (with even the ones from other VMs included) for
adding an external or evicted object isn't very efficient. Not to mention that
the maintenance when the mapping we've (randomly) picked as an entry for the
external/evicted object list is unmapped, but there are still mappings left in
the VM with the same backing BO.
Now, a way to get rid of the VM_BO abstraction would be to use maple trees
instead, since then we can store drm_gem_object structs directly for each VM.
However, Xe had concerns about using maple trees and preferred lists, plus
having maple trees wouldn't get rid of the concerns of amdgpu not having a VM_BO
abstraction for cases with tons of VMs and tons of mappings per BO. Hence,
having a VM_BO abstraction enabling us to track external/evicted objects with
lists seems to satisfy everyone's needs.
- Danilo
>
> To me that looks like a substantial amount of less code / complexity?
>
> /Thomas
>
>
> >
> > >
> > > Thanks,
> > >
> > > Thomas
> > >
> > >
> >
>