Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination

From: Thomas Hellström
Date: Tue Sep 12 2023 - 06:35:29 EST



On 9/12/23 12:06, Danilo Krummrich wrote:
On Tue, Sep 12, 2023 at 09:42:44AM +0200, Thomas Hellström wrote:
Hi, Danilo

On 9/11/23 19:49, Danilo Krummrich wrote:
Hi Thomas,

On 9/11/23 19:19, Thomas Hellström wrote:
Hi, Danilo

On 9/9/23 17:31, Danilo Krummrich wrote:
This patch adds an abstraction layer between the drm_gpuva mappings of
a particular drm_gem_object and this GEM object itself. The abstraction
represents a combination of a drm_gem_object and drm_gpuvm. The
drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
representing this abstraction), while each drm_gpuvm_bo contains
list of
mappings of this GEM object.

This has multiple advantages:

1) We can use the drm_gpuvm_bo structure to attach it to various lists
    of the drm_gpuvm. This is useful for tracking external and evicted
    objects per VM, which is introduced in subsequent patches.

2) Finding mappings of a certain drm_gem_object mapped in a certain
    drm_gpuvm becomes much cheaper.

3) Drivers can derive and extend the structure to easily represent
    driver specific states of a BO for a certain GPUVM.

The idea of this abstraction was taken from amdgpu, hence the
credit for
this idea goes to the developers of amdgpu.

Cc: Christian König <christian.koenig@xxxxxxx>
Signed-off-by: Danilo Krummrich <dakr@xxxxxxxxxx>
Did you consider having the drivers embed the struct drm_gpuvm_bo in
their own bo definition? I figure that would mean using the gem bo's
refcounting and providing a helper to call from the driver's bo
release. Looks like that could potentially save a lot of code? Or is
there something that won't work with that approach?
There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free
callback for drivers to register for that purpose.

- Danilo
Now after looking a bit deeper, I think actually the question could be
rephrased as, why don't we just use the
struct drm_gem_object::gpuva struct as the drm_gpuvm_bo in the spirit of
keeping things simple? Drivers would then just embed it in their bo subclass
and we'd avoid unnecessary fields in the struct drm_gem_object for drivers
that don't do VM_BIND yet.
struct drm_gem_object::gpuva is just a container containing a list in order to
(currently) attach drm_gpuva structs to it and with this patch attach
drm_gpuvm_bo structs (combination of BO + VM) to it. Doing the above basically
means "leave everything as it is, but move the list_head of drm_gpuvs per GEM to
the driver specific BO structure". Having a common connection between GEM
objects and drm_gpuva structs was one of the goals of the initial GPUVA manager
patch series however.

Sure, this won't be per bo and per vm, but it'd really only make a slight
difference where we have multiple VMAs per bo, where per-vm per-bo state
either needs to be duplicated or attached to a single vma (as in the case of
the external bo list).

Correct, one implication is that we don't get a per VM and BO abstraction, and
hence are left with a list of all drm_gpuva structs having the same backing BO,
regardless of the VM.

For amdgpu this was always a concern. Now that we want to keep track of external
and evicted objects it's going to be a concern for most drivers I guess. Because
the only structure we could use for tracking external and evicted objects we are
left with (without having a VM_BO abstraction) is struct drm_gpuva. But this
structure isn't unique and we need to consider cases where userspace just
allocates rather huge BOs and creates tons of mappings from it. Running the full
list of drm_gpuva structs (with even the ones from other VMs included) for
adding an external or evicted object isn't very efficient. Not to mention that
the maintenance when the mapping we've (randomly) picked as an entry for the
external/evicted object list is unmapped, but there are still mappings left in
the VM with the same backing BO.
For the evicted object it's not much of an issue; we maintain a list of vmas needing rebinding for each VM rather than objects evicted, so there is no or very little additional overhead there. The extobj list is indeed a problem if many VMAs are bound to the same bo. Not that the code snippets are complicated, but the list traversals would be excessive.

Now, a way to get rid of the VM_BO abstraction would be to use maple trees
instead, since then we can store drm_gem_object structs directly for each VM.
However, Xe had concerns about using maple trees and preferred lists, plus
having maple trees wouldn't get rid of the concerns of amdgpu not having a VM_BO
abstraction for cases with tons of VMs and tons of mappings per BO. Hence,
having a VM_BO abstraction enabling us to track external/evicted objects with
lists seems to satisfy everyone's needs.

Indeed this is a tradeoff between a simple implementation that is OK for situations with not many VMs nor VMAs per bo vs a more complex implementation that optimizes for the opposite case.

So if this latter is a case we need to optimize for at this point then I guess it's the way to go.
(I'm in the process of adapting the xe driver to this, so I just wanted to bring up areas where the implementations differ quite a lot and make sure options are discussed).

Thanks,

Thomas



- Danilo

To me that looks like a substantial amount of less code / complexity?

/Thomas


Thanks,

Thomas