Re: [PATCH v4 0/9] mm: workingset reporting
From: SeongJae Park
Date: Wed Nov 27 2024 - 14:40:51 EST
+ damon@xxxxxxxxxxxxxxx
I haven't thoroughly read any version of this patch series due to my laziness,
sorry. So I may be saying something completely wrong. My apologies in advance,
and please correct me if so.
> On Tue, Nov 26, 2024 at 06:57:19PM -0800, Yuanchu Xie wrote:
> > This patch series provides workingset reporting of user pages in
> > lruvecs, of which coldness can be tracked by accessed bits and fd
> > references.
DAMON provides data access patterns of user pages. It is not exactly named
workingset, but the information is a superset of it. Users can therefore derive
the workingset from the raw data that DAMON provides. So I feel I have to ask
whether DAMON can be used for, or help achieve, the purpose of this patch series.
Depending on the detailed definition of workingset, of course, the workingset
we can get from DAMON might not be technically the same as what this patch
series aims to provide, and the difference could be one that makes DAMON unable
to be used or to help here. But I cannot tell whether this is the case from
this cover letter alone.
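As a rough illustration of deriving a workingset from access-pattern data, here
is a minimal sketch. It assumes each record is an address range with an access
count; the Region type, the workingset_bytes() helper, and the sample numbers
are hypothetical and are not DAMON's actual output format or API.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical, simplified access-pattern record for one address range."""
    start: int        # inclusive start address
    end: int          # exclusive end address
    nr_accesses: int  # accesses observed during the last aggregation window


def workingset_bytes(regions, min_accesses=1):
    """Sum the sizes of regions accessed at least min_accesses times."""
    return sum(r.end - r.start for r in regions if r.nr_accesses >= min_accesses)


# Made-up snapshot: two accessed regions and one idle region.
snapshot = [
    Region(0x1000, 0x5000, 12),
    Region(0x5000, 0x9000, 0),
    Region(0x9000, 0xA000, 3),
]

print(workingset_bytes(snapshot))                  # regions accessed at all
print(workingset_bytes(snapshot, min_accesses=5))  # only the hotter regions
```

The threshold is the part that depends on the "detailed definition of
workingset": a different definition would only change the filter, not the raw
data it is applied to.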
> > However, the concept of workingset applies generically to
> > all types of memory, which could be kernel slab caches, discardable
> > userspace caches (databases), or CXL.mem. Therefore, data sources might
> > come from slab shrinkers, device drivers, or the userspace.
> > Another interesting idea might be hugepage workingset, so that we can
> > measure the proportion of hugepages backing cold memory. However, with
> > architectures like arm, there may be too many hugepage sizes leading to
> > a combinatorial explosion when exporting stats to the userspace.
> > Nonetheless, the kernel should provide a set of workingset interfaces
> > that is generic enough to accommodate the various use cases, and extensible
> > to potential future use cases.
This again sounds similar to what DAMON aims to provide, to me. DAMON is
designed to be easy to extend for various use cases and internal mechanisms.
Specifically, it separates access check mechanisms and core logic into
different layers, and provides an interface for extending DAMON with new
mechanisms. DAMON's two access check mechanisms, for virtual address spaces and
for the physical address space, are indeed built on that interface. There were
also RFC patch series extending DAMON for NUMA-specific and write-only access
monitoring, using NUMA hinting faults and soft-dirty PTEs as the internal
mechanisms.
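The layering described above can be sketched as follows. This is only a toy
model of the idea (core logic calling into a pluggable access check mechanism),
not DAMON's kernel interface: AccessCheckOps, AccessedBitOps, FaultLogOps, and
monitor() are all hypothetical names made up for this illustration.

```python
from abc import ABC, abstractmethod


class AccessCheckOps(ABC):
    """Hypothetical interface that an access check mechanism implements."""

    @abstractmethod
    def check_access(self, addr: int) -> bool:
        """Return True if addr was accessed since the last check."""


class AccessedBitOps(AccessCheckOps):
    """Toy stand-in for a mechanism based on page table accessed bits."""

    def __init__(self, accessed_addrs):
        self.accessed = set(accessed_addrs)

    def check_access(self, addr):
        return addr in self.accessed


class FaultLogOps(AccessCheckOps):
    """Toy stand-in for a fault-based mechanism: an address counts as
    accessed if it appears in a log of faulting addresses."""

    def __init__(self, fault_log):
        self.fault_log = list(fault_log)

    def check_access(self, addr):
        return addr in self.fault_log


def monitor(ops, addrs, rounds):
    """Core logic: sample every address each round and count the hits.
    It depends only on AccessCheckOps, never on a concrete mechanism."""
    counts = {addr: 0 for addr in addrs}
    for _ in range(rounds):
        for addr in addrs:
            if ops.check_access(addr):
                counts[addr] += 1
    return counts


print(monitor(AccessedBitOps({1, 2}), [1, 2, 3], rounds=3))
print(monitor(FaultLogOps([3, 3, 1]), [1, 2, 3], rounds=2))
```

The point of the sketch is that monitor() does not change when a new mechanism
is added; a new mechanism only needs to implement the small interface.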
My humble understanding is that the major difference between DAMON and
workingset reporting is the internal mechanism. Workingset reporting uses MGLRU
as the access check mechanism, while DAMON's current access check mechanisms
mainly rely on checking page table accessed bits. I think DAMON can be extended
to use MGLRU as another internal access check mechanism, but I understand that
there could be many things that I am overlooking.
Yuanchu, I think it would help me and other reviewers better understand this
patch series if you could share your thoughts on that. I will also be more than
happy to help you and others better understand what DAMON can and cannot do
through this discussion.
>
> Doesn't DAMON already provide this information?
>
> CCing SJ.
Thank you for adding me, Johannes :)
[...]
> It does provide more detailed insight into userspace memory behavior,
> which could be helpful when trying to make sense of applications that
> sit on a rich layer of libraries and complicated runtimes. But here a
> comparison to DAMON would be helpful.
100% agree.
Thanks,
SJ
[...]