On Thu, Apr 13, 2023 at 01:58:34PM +0100, Tvrtko Ursulin wrote:
On 12/04/2023 20:18, Daniel Vetter wrote:
On Wed, Apr 12, 2023 at 11:42:07AM -0700, Rob Clark wrote:
On Wed, Apr 12, 2023 at 11:17 AM Daniel Vetter <daniel@xxxxxxxx> wrote:
On Wed, Apr 12, 2023 at 10:59:54AM -0700, Rob Clark wrote:
On Wed, Apr 12, 2023 at 7:42 AM Tvrtko Ursulin
<tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
On 11/04/2023 23:56, Rob Clark wrote:
From: Rob Clark <robdclark@xxxxxxxxxxxx>
Add support to dump GEM stats to fdinfo.
v2: Fix typos, change size units to match docs, use div_u64
v3: Do it in core
Signed-off-by: Rob Clark <robdclark@xxxxxxxxxxxx>
Reviewed-by: Emil Velikov <emil.l.velikov@xxxxxxxxx>
---
Documentation/gpu/drm-usage-stats.rst | 21 ++++++++
drivers/gpu/drm/drm_file.c | 76 +++++++++++++++++++++++++++
include/drm/drm_file.h | 1 +
include/drm/drm_gem.h | 19 +++++++
4 files changed, 117 insertions(+)
diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
index b46327356e80..b5e7802532ed 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -105,6 +105,27 @@ object belong to this client, in the respective memory region.
Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
indicating kibi- or mebi-bytes.
+- drm-shared-memory: <uint> [KiB|MiB]
+
+The total size of buffers that are shared with another file (i.e. have more
+than a single handle).
+
+- drm-private-memory: <uint> [KiB|MiB]
+
+The total size of buffers that are not shared with another file.
+
+- drm-resident-memory: <uint> [KiB|MiB]
+
+The total size of buffers that are resident in system memory.
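For illustration, with these additions a client's fdinfo could contain
something like the following (the values are made up):

drm-driver: msm
drm-shared-memory: 4096 KiB
drm-private-memory: 2048 KiB
drm-resident-memory: 6144 KiB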
I think this naming perhaps does not fit well with the existing
drm-memory-<region> keys.
Actually, it was very deliberate not to conflict with the existing
drm-memory-<region> keys ;-)

I would have preferred drm-memory-{active,resident,...} but it could be
mis-parsed by existing userspace (which takes whatever follows the
'drm-memory-' prefix as a region name), so my hands were a bit tied.
How about introducing the concept of a memory region from the start, and
using naming similar to what we do for engines?

drm-memory-$CATEGORY-$REGION: ...
Then we document a bunch of categories and their semantics, for instance:
'size' - All reachable objects
'shared' - Subset of 'size' with handle_count > 1
'resident' - Objects with backing store
'active' - Objects in use, subset of resident
'purgeable' - Or inactive? Subset of resident.
We keep the same semantics as with process memory accounting (if I got
it right), which could be desirable for a simplified mental model.
(AMD needs to remind me of their 'drm-memory-...' key semantics. If we
captured it correctly in the first round it should be equivalent to
'resident' above. In any case we can document which category the
category-less key is equal to, and that at most one of the two must be
output.)
Region names we would at most partially standardize. For instance we
could say 'system' is to be used where the backing store is system RAM,
while the rest are driver-defined.
Then discrete GPUs could emit N sets of key-values, one for each memory
region they support.
I think this all also works for objects which can be migrated between
memory regions. 'size' accounts them against all regions, while for
'resident' they only appear in the region of their current placement, etc.
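To illustrate with made-up numbers, an 8 MiB object that can be placed in
either system or vram, and is currently resident in vram, would show up as:

drm-memory-size-system: 8192 KiB
drm-memory-size-vram: 8192 KiB
drm-memory-resident-vram: 8192 KiB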
I'm not too sure how to reconcile different memory regions with this,
since drm core doesn't really know about the driver's memory regions.
Perhaps we can go back to this being a helper, and drivers with vram
just don't use the helper? Or??
I think if you flip it around to drm-$CATEGORY-memory{-$REGION}: then it
all works out reasonably consistently?
That is basically what we have now. I could append -system to each to
make it easier to add vram/etc later (from a uabi standpoint).
What you have isn't really -system, but everything. So it doesn't really
make sense to me to mark this -system; it's only really true for
integrated (if they don't have stolen or something like that).
Also my comment was more in reply to Tvrtko's suggestion.
Right, so my proposal was drm-memory-$CATEGORY-$REGION, which I think
aligns with the current drm-memory-$REGION by extending it, rather than
creating confusion with a different order of key name components.
Oh, my comment was pretty much just bikeshed, in case someone creates a
$REGION name that collides with a $CATEGORY other drivers use. Kinda
Rob's parsing point. So: $CATEGORY before the -memory.
Otoh I don't think that'll happen, so I guess we can go with whatever more
folks like :-) I don't really care much personally.
AMD currently has (among others) drm-memory-vram, which we could define
in the spec as mapping to category X when the category component is not
present.
Some examples:
drm-memory-resident-system:
drm-memory-size-lmem0:
drm-memory-active-vram:
Etc. I think it creates a consistent story.
Other than this, my two (I think) significant open questions which
haven't been addressed yet are:
1)
Why do we want totals (not per region) when userspace can trivially
aggregate them if it wants? What is the use case?
2)
The current proposal limits the value to whole objects and cements that
by having it in the common code. If/when some driver is able to support
sub-BO granularity, it will need to opt out of the common printer, at
which point it may be less churn to start with a helper rather than a
mid-layer. Or maybe some drivers already support this, I don't know.
Given how important VM BIND is, I wouldn't be surprised.
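For concreteness, a minimal sketch of what such an opt-in helper could
look like (the function name is hypothetical, whole-object granularity is
assumed, and the locking follows what drm_gem.c already does around
object_idr):

#include <linux/idr.h>
#include <drm/drm_file.h>
#include <drm/drm_gem.h>
#include <drm/drm_print.h>

/*
 * Hypothetical opt-in helper: walk one client's handles and print the
 * whole-object totals.  A driver would call this from its show_fdinfo
 * callback instead of getting it unconditionally from drm core, and a
 * driver with sub-BO granularity would simply print its own numbers.
 */
void drm_fdinfo_print_gem_stats(struct drm_printer *p,
				struct drm_file *file)
{
	struct drm_gem_object *obj;
	u64 shared = 0, private = 0;
	int id;

	spin_lock(&file->table_lock);
	idr_for_each_entry(&file->object_idr, obj, id) {
		/* Shared means more than a single handle, per the doc above. */
		if (obj->handle_count > 1)
			shared += obj->size;
		else
			private += obj->size;
	}
	spin_unlock(&file->table_lock);

	drm_printf(p, "drm-shared-memory:\t%llu KiB\n", shared >> 10);
	drm_printf(p, "drm-private-memory:\t%llu KiB\n", private >> 10);
}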
I feel like for drivers using ttm we want a ttm helper which takes care
of the region printing in, hopefully, a standard way. And that could then
also take care of all kinds of partial binding and funny rules (like
maybe we want a standard vram region that adds up all the lmem regions on
intel, so that all dGPUs have a common vram bucket that generic tools
understand?).
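Purely as a sketch of that direction (names hypothetical, per-BO locking
hand-waved, and assuming the current placement is visible through
ttm_resource), something like:

#include <linux/idr.h>
#include <drm/drm_file.h>
#include <drm/drm_print.h>
#include <drm/ttm/ttm_bo.h>
#include <drm/ttm/ttm_placement.h>
#include <drm/ttm/ttm_resource.h>

/*
 * Hypothetical ttm helper: bucket one client's buffers by the memory
 * type their backing store currently lives in.  Only resident memory is
 * counted; a BO without a resource has no backing store.
 */
void ttm_fdinfo_print_mem_stats(struct drm_printer *p,
				struct drm_file *file)
{
	u64 resident[TTM_NUM_MEM_TYPES] = {};
	struct drm_gem_object *obj;
	int id;

	spin_lock(&file->table_lock);
	idr_for_each_entry(&file->object_idr, obj, id) {
		struct ttm_buffer_object *bo =
			container_of(obj, struct ttm_buffer_object, base);

		if (bo->resource)
			resident[bo->resource->mem_type] += bo->resource->size;
	}
	spin_unlock(&file->table_lock);

	/* Key naming per one of the schemes discussed above. */
	drm_printf(p, "drm-memory-resident-system: %llu KiB\n",
		   resident[TTM_PL_SYSTEM] >> 10);
	drm_printf(p, "drm-memory-resident-vram: %llu KiB\n",
		   resident[TTM_PL_VRAM] >> 10);
}

A driver could then remap or add buckets before printing, e.g. summing
its lmem regions into the common vram key.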
It does mean we walk the bo list twice, but *shrug*. People have been
complaining about procutils for decades and they're still horrible; I
think walking bo lists twice internally in the ttm case is going to be
ok. If not, it's internals, we can change them again.
Also I'd lean a lot more towards making this a ttm helper and not putting
it into core, exactly because it's pretty clear we'll need more
flexibility when it comes to accurate stats for multi-region drivers.
But for a first "how much gpu space does this app use" across everything I
think this is a good enough starting point.