Re: [PATCH v5] dma-buf: Add DmaBufTotal counter in meminfo

From: Peter.Enderborg
Date: Wed Apr 21 2021 - 13:36:50 EST


On 4/21/21 5:31 PM, Mike Rapoport wrote:
> On Wed, Apr 21, 2021 at 10:37:11AM +0000, Peter.Enderborg@xxxxxxxx wrote:
>> On 4/21/21 11:15 AM, Daniel Vetter wrote:
>>> We need to understand what the "correct" value is. Not in terms of kernel
>>> code, but in terms of semantics. Like if userspace allocates a GL texture,
>>> is this supposed to show up in your metric or not. Stuff like that.
>> That is like having only one pointer type. You need to know what you are
>> pointing at to know what it is; it might be a hardware pointer or some
>> other pointer.
>>
>> If there is a limit on your pointers, it is a good metric to count them
>> even if you don't know what they are. The same goes for dma-bufs: they
>> are generic, but they consume resources that are counted in pages.
>>
>> It would be very good if there were a subdivision where you could measure
>> all possible types separately. We have the details in debugfs, but nothing
>> for the user. A summary in meminfo seems to be the best place for such a
>> metric.
>
> Let me try to summarize my understanding of the problem, maybe it'll help
> others as well.

Thanks!


> A device driver allocates memory and exports this memory via dma-buf so
> that this memory will be accessible for userspace via a file descriptor.
>
> The allocated memory can be either allocated with alloc_page() from system
> RAM or by other means from dedicated VRAM (that is not managed by Linux mm)
> or even from on-device memory.
>
> The dma-buf driver tracks the amount of the memory it was requested to
> export and the size it sees is available at debugfs and fdinfo.
>
> The debugfs is not available to the user and may be entirely disabled in
> production systems.
>
> There could be quite a few open dma-bufs so there is no overall summary,
> plus fdinfo in production systems you refer to is also unavailable to the
> user because of selinux policy.
>
> And there are a few details that are not clear to me:
>
> * Since DRM device drivers seem to be the major user of dma-buf exports,
> why cannot we add information about their memory consumption to, say,
> /sys/class/graphics/drm/cardX/memory-usage?

Android is using it for binder, which connects more or less everything
internally.

> * How exactly user generates reports that would include the new counters?
> From my (mostly outdated) experience Android users won't open a terminal
> and type 'cat /proc/meminfo' there. I'd presume there is a vendor agent
> that collects the data and sends it for analysis. In this case what is
> the reason the vendor is unable to adjust selinux policy so that the
> agent will be able to access fdinfo?

When you turn on developer mode on Android you can connect over USB
with a program called adb, and there you get a normal shell
(not root, though).

There are applications that non-developers can use to get
information. It is very limited, though, and there are APIs
that provide it.
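
To make the meminfo case concrete, here is a minimal sketch of how a
tool running in such an unprivileged shell could pick one counter out
of /proc/meminfo text. It assumes the counter shows up as a
"DmaBufTotal:" line in the usual meminfo layout, as this patch
proposes; the helper name and the sample numbers are made up for
illustration.

```python
def meminfo_value(text, field):
    """Return the number printed for `field` in /proc/meminfo-style text.

    The value is returned as printed (kB for most fields).
    Returns None when the field is not present.
    """
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() == field:
            parts = rest.split()
            if parts:
                return int(parts[0])
    return None


# Hypothetical snapshot; on a device the text would come from
# open("/proc/meminfo").read().
sample = """MemTotal:        5792648 kB
MemFree:          123456 kB
DmaBufTotal:      204800 kB
"""

print(meminfo_value(sample, "DmaBufTotal"))
```

On a kernel without the patch the lookup simply returns None, so the
same tool can run on old and new devices.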


>
> * And, as others already mentioned, it is not clear what are the problems
> that can be detected by examining DmaBufTotal except saying "oh, there is
> too much/too little memory exported via dma-buf". What would be user
> visible effects of these problems? What are the next steps to investigate
> them? What other data will be probably required to identify root cause?
>
When you debug thousands of devices it is quite nice to have
ways to classify what the problem is not. The normal user does not
see any of this. However, they can generate bug reports that
collect as much information as they can. The user then has
to provide this bug report to the manufacturer or, mostly, the
application developer. And when the problem is
system related we need to reproduce the issue on a fully
debug-enabled unit.
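
As a rough sketch of that kind of classification, assuming each bug
report carries a /proc/meminfo snapshot with the proposed DmaBufTotal
line: the function below marks a report as a dma-buf suspect only when
the counter is a large share of MemTotal, so everything else can be
ruled out early. The 25% threshold, the function names and the report
contents are hypothetical, and since exported memory need not come from
system RAM the ratio is only a first hint.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {name: value} dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dmabuf_suspect(text, threshold=0.25):
    """Return True/False, or None when the snapshot cannot tell us.

    `threshold` is an arbitrary illustrative cut-off, not a tuned value.
    """
    fields = parse_meminfo(text)
    total = fields.get("MemTotal")
    dmabuf = fields.get("DmaBufTotal")
    if not total or dmabuf is None:
        return None
    return dmabuf / total > threshold


# Hypothetical snapshots taken from three bug reports.
reports = {
    "report-a": "MemTotal: 4000000 kB\nDmaBufTotal: 1600000 kB\n",
    "report-b": "MemTotal: 4000000 kB\nDmaBufTotal: 120000 kB\n",
    "report-c": "MemTotal: 4000000 kB\n",  # kernel without the counter
}

verdicts = {name: dmabuf_suspect(text) for name, text in reports.items()}
for name, verdict in verdicts.items():
    print(name, verdict)
```

Reports that come back False can be set aside as "not a dma-buf
problem" before anyone tries to reproduce them on a debug unit.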