Re: [PATCH] dmabuf: Add the capability to expose DMA-BUF stats in sysfs
From: Hridya Valsaraju
Date: Thu Dec 10 2020 - 18:00:08 EST
Thanks again for the reviews!
On Thu, Dec 10, 2020 at 3:03 AM Christian König
<christian.koenig@xxxxxxx> wrote:
>
> Am 10.12.20 um 11:56 schrieb Greg KH:
> > On Thu, Dec 10, 2020 at 11:27:27AM +0100, Daniel Vetter wrote:
> >> On Thu, Dec 10, 2020 at 11:10:45AM +0100, Greg KH wrote:
> >>> On Thu, Dec 10, 2020 at 10:58:50AM +0100, Christian König wrote:
> >>>> In general a good idea, but I have a few concern/comments here.
> >>>>
> >>>> Am 10.12.20 um 05:43 schrieb Hridya Valsaraju:
> >>>>> This patch allows statistics to be enabled for each DMA-BUF in
> >>>>> sysfs by enabling the config CONFIG_DMABUF_SYSFS_STATS.
> >>>>>
> >>>>> The following stats will be exposed by the interface:
> >>>>>
> >>>>> /sys/kernel/dmabuf/<inode_number>/exporter_name
> >>>>> /sys/kernel/dmabuf/<inode_number>/size
> >>>>> /sys/kernel/dmabuf/<inode_number>/dev_map_info
> >>>>>
> >>>>> The inode_number is unique for each DMA-BUF and was added earlier [1]
> >>>>> in order to allow userspace to track DMA-BUF usage across different
> >>>>> processes.
> >>>>>
> >>>>> Currently, this information is exposed in
> >>>>> /sys/kernel/debug/dma_buf/bufinfo.
> >>>>> However, since debugfs is considered unsafe to be mounted in production,
> >>>>> it is being duplicated in sysfs.
> >>>> Mhm, this makes it part of the UAPI. What is the justification for this?
> >>>>
> >>>> In other words do we really need those debug information in a production
> >>>> environment?
> >>> Production environments seem to want to know who is using up memory :)
> >> This only shows shared memory, so it does smell a lot like $specific_issue
> >> and we're designing a narrow solution for that and then have to carry it
> >> forever.
> > I think the "issue" is that this was a feature from ion that people
> > "missed" in the dmabuf move. Taking away the ability to see what kind
> > of allocations were being made didn't make a lot of debugging tools
> > happy :(
>
> Yeah, that is certainly a very valid concern.
>
> > But Hridya knows more, she's been dealing with the transition for a long
> > time now.
Currently, telemetry tools capture this information(along with other
memory metrics) periodically as well as on important events like a
foreground app kill (which might have been triggered by an LMK). We
would also like to get a snapshot of the system memory usage on other
events such as OOM kills and ANRs.
> >
> >> E.g. why is the list of attachments not a sysfs link? That's how we
> >> usually expose struct device * pointers in sysfs to userspace, not as a
> >> list of things.
> > These aren't struct devices, so I don't understand the objection here.
> > Where else could these go in sysfs?
>
> Sure they are! Just take a look at an attachment:
>
> struct dma_buf_attachment {
> struct dma_buf *dmabuf;
> struct device *dev;
>
> This is the struct device which is importing the buffer and the patch in
> discussion is just printing the name of this device into sysfs.
I actually did not know that this is not ok to do. I will change it in
the next version of the patch to be sysfs links instead.
>
> >> Furthermore we don't have the exporter device covered anywhere, how is
> >> that tracked? Yes Android just uses ion for all shared buffers, but that's
> >> not how all of linux userspace works.
> > Do we have the exporter device link in the dmabuf interface? If so,
> > great, let's use that, but for some reason I didn't think it was there.
>
> Correct, since we don't really need a device as an exporter (it can just
> be a system heap as well) we only have a const char* as name for the
> exporter.
Yes, the file exporter_name prints out this information.
>
> >> Then I guess there's the mmaps, you can fish them out of procfs. A tool
> >> which collects all that information might be useful, just as demonstration
> >> of how this is all supposed to be used.
> > There's a script somewhere that does this today, again, Hridya knows
> > more.
That is correct, we do have a tool in AOSP that gathers the
per-process DMA-BUF map stats from procfs.
https://android.googlesource.com/platform/system/memory/libmeminfo/+/refs/heads/master/libdmabufinfo/tools/dmabuf_dump.cpp
When I send the next revision of the patch, I will also include links
to AOSP CLs that show the usage for the sysfs files.
> >
> >> There's also some things to make sure we're at least having thought about
> >> how other things fit in here. E.d. dma_resv attached to the dma-buf
> >> matters in general a lot. It doesn't matter on Android because
> >> everything's pinned all the time anyway.
I see your point Daniel! I will make the interface extendable in the
next version of the patch.
> >>
> >> Also I thought sysfs was one value one file, dumping an entire list into
> >> dev_info_map with properties we'll need to extend (once you care about
> >> dma_resv you also want to know which attachments are dynamic) does not
> >> smell like sysfs design at all.
> > sysfs is one value per file, what is being exported that is larger than
> > that here? Did I miss something on review?
>
> See this chunk here:
>
> +
> + list_for_each_entry(attachment, &dmabuf->attachments, node) {
> + if (attachment->map_counter) {
> + ret += sysfs_emit_at(buf, ret, "%s ",
> + dev_name(attachment->dev));
> + }
> + }
>
> And yes now that Daniel mentioned that it looks like a sysfs rules
> violation to me as well.
Sysfs rules do seem to allow an array of similar values in one file
https://elixir.bootlin.com/linux/v5.10-rc7/source/Documentation/filesystems/sysfs.rst#L63
However, I agree that we should change it so that it can be expanded
easily in the future. I will fix it in the next version. Thank you all
for pointing it out!
Regards,
Hridya
>
> Regards,
> Christian.
>
>
> >
> > thanks,
> >
> > greg k-h
>