Re: [RFC PATCH v2 0/3] Add NUMA-aware DAMOS watermarks

From: Jonghyeon Kim
Date: Fri May 24 2024 - 01:24:46 EST


On Tue, May 21, 2024 at 06:00:34PM -0700, SeongJae Park wrote:
> Hi Jonghyeon,
>

Hi SeongJae,

> On Mon, 20 May 2024 14:30:35 +0000 Jonghyeon Kim <tome01@xxxxxxxxxx> wrote:
>
> > Current DAMOS schemes do not take multiple NUMA memory nodes into account.
> > For example, if we want to proactively reclaim memory of a single NUMA node,
> > DAMON_RECLAIM has to wake up kdamond before kswapd starts reclaiming memory.
> > However, since the DAMON watermarks are based on total system free memory
> > rather than a single NUMA memory node, kdamond is not woken up before
> > kswapd of the target node starts memory reclamation.
> >
> > These patches allow DAMON to select either total memory or a specific
> > NUMA memory node as the monitoring target.
>
> I feel such usage could exist, but my humble brain is not clearly imagining
> such realistic usage. If you could further clarify the expected usage, it
> would be very helpful for me to better understand the intention and pros/cons
> of this patchset. Especially, I'm wondering why they would want to use the
> watermark feature, rather than manually checking the metric and turning DAMON
> on/off, or feeding the metric as a quota tuning goal.
>

The goal of this patchset is to manage each NUMA memory node
individually through DAMON. The main target scheme is memory
reclaim (or demotion in tiered memory). By allowing DAMON to be
configured per NUMA node, I expect that users can easily set up memory
reclaim for each node.

Additionally, I think that a watermark for each node is an appropriate
metric for activating DAMON_RECLAIM, because the kswapd reclaim logic
also follows a watermark of free memory for each node.

There are two use cases. Let's assume a system with two NUMA nodes of
32GB (node0) and 16GB (node1), respectively.

The first case is when using the DAMON modules. If users do not specify
a monitoring region, the DAMON module finds the larger of the two NUMA
memory nodes and designates it as the monitoring region (node0, 32GB).
Even if we want to enable DAMON_RECLAIM for node0, it does not work
proactively because the watermark is based on the total system memory
(48GB).

Similarly, if users want to enable DAMON_RECLAIM for node1, they have
to manually set the monitoring region to the address range of node1.
Nonetheless, since DAMON still follows the default watermark
(total memory, 48GB), proactive reclaim will not work properly.
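
To make the arithmetic concrete, here is a rough Python sketch (it is
not code from the patchset). The free memory amounts and the mid/low
watermark values are assumptions chosen only for illustration, and
would_activate() is a simplified version of the watermark check for a
scheme that is currently inactive.

```python
# Illustrative sketch only; not code from the patchset.
GIB = 1 << 30

def free_mem_rate(free_bytes, total_bytes):
    """Free memory rate in per-thousand, truncated to an integer."""
    return free_bytes * 1000 // total_bytes

def would_activate(metric, mid, low):
    """Simplified check for an inactive scheme: low <= metric < mid."""
    return low <= metric < mid

# Assumed watermarks (per-thousand) and assumed free memory per node.
WMARK_MID, WMARK_LOW = 400, 200
node0_total, node0_free = 32 * GIB, 8 * GIB
node1_total, node1_free = 16 * GIB, 15 * GIB

system_rate = free_mem_rate(node0_free + node1_free, node0_total + node1_total)
node0_rate = free_mem_rate(node0_free, node0_total)

print(system_rate, would_activate(system_rate, WMARK_MID, WMARK_LOW))  # 479 False
print(node0_rate, would_activate(node0_rate, WMARK_MID, WMARK_LOW))    # 250 True
```

With these assumed numbers, node0 alone is already below the mid
watermark, but the system-wide rate is still above it because node1 is
mostly free, so a watermark based on total memory would keep the scheme
inactive.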

Below is an example.

# echo Y > /sys/module/damon_reclaim/parameters/enabled
# cat /sys/module/damon_reclaim/parameters/monitor_region_start
4294967296 # 0x100000000
# cat /sys/module/damon_reclaim/parameters/monitor_region_end
36507222015 # 0x87fffffff

# dmesg | grep node
..
[0.012812] Early memory node ranges
[0.012813] node 0: [mem 0x0000000000001000-0x000000000009ffff]
[0.012815] node 0: [mem 0x0000000000100000-0x000000005e22dfff]
[0.012817] node 0: [mem 0x000000005e62c000-0x0000000069835fff]
[0.012818] node 0: [mem 0x000000006f2d3000-0x000000006f7fffff]
[0.012819] node 0: [mem 0x0000000100000000-0x000000087fffffff] < target
[0.012825] node 1: [mem 0x0000002800000000-0x0000002bffffffff]
..

By default, DAMON_RECLAIM targets node0 memory (32GB). However, DAMON
does not behave as intended because the watermark is based on the
combined memory of node0 and node1 (48GB). DAMON_LRU_SORT faces the
same situation.
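
As a sanity check on the addresses above, the sizes of the ranges can
be computed as below. This is just a small Python sketch; range_size()
is a helper name I made up for illustration, and the addresses are
copied from the dmesg output and the module parameters shown above.

```python
# Illustrative sketch only: sizes of the physical address ranges above.
GIB = 1 << 30

def range_size(start, end):
    """Size in bytes of an address range with an inclusive end."""
    return end - start + 1

# monitor_region_start/end in decimal match the hex range in dmesg.
assert 4294967296 == 0x100000000
assert 36507222015 == 0x87fffffff

node0_target = range_size(0x100000000, 0x87fffffff)
node1_range = range_size(0x2800000000, 0x2bffffffff)

print(node0_target / GIB)  # 30.0
print(node1_range / GIB)   # 16.0
```

The selected region is the single largest range of node0 (30GiB); the
remaining node0 memory is in the smaller ranges below 4GiB.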

The second case is when we apply DAMON to a process. If a process
allocates memory that exceeds the capacity of a single NUMA node
(node0), some users may want to reclaim the cold memory of the process
on that node. In my humble opinion, the reclaim scheme (DAMOS_PAGEOUT)
is effective in this case. Unlike the DAMON modules, DAMON monitors
process memory using virtual addresses, so it is hard to decide whether
to enable DAMOS_PAGEOUT without per-node memory stats. Even if we use
watermarks for DAMOS_PAGEOUT, they behave the same as in the module
case above (thresholds based on total memory, 48GB). To overcome this
problem, I think a dedicated watermark (for node0) can be an answer.
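
For reference, the per-node free memory rate can be approximated from
user space as in the rough Python sketch below. It parses text in the
format of /sys/devices/system/node/node0/meminfo. The sample values are
made up for illustration, and node_free_rate() is a hypothetical helper,
not something the patchset provides.

```python
# Illustrative sketch only; not code from the patchset.
import re

def node_free_rate(meminfo_text):
    """Per-thousand free memory rate from a node's meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        # Lines look like: "Node 0 MemFree:   4194304 kB"
        m = re.match(r"Node\s+\d+\s+(\w+):\s+(\d+)\s+kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields["MemFree"] * 1000 // fields["MemTotal"]

# Made-up sample: 32GiB total and 4GiB free on node0.
SAMPLE = """\
Node 0 MemTotal:       33554432 kB
Node 0 MemFree:         4194304 kB
"""

print(node_free_rate(SAMPLE))  # 125
```

On a real system the text would be read from the node's meminfo file
instead of SAMPLE. Doing this check manually for every node is the kind
of work a per-node watermark would make unnecessary.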

> >
> > ---
> > Changes from RFC PATCH v1
> > (https://lore.kernel.org/all/20220218102611.31895-1-tome01@xxxxxxxxxx)
> > - Add new metric type for NUMA node, DAMOS_WMARK_NODE_FREE_MEM_RATE
> > - Drop commit about damon_start()
> > - Support DAMON_LRU_SORT
> >
> > Jonghyeon Kim (3):
> > mm/damon: Add new metric type and target node for watermark
> > mm/damon: add module parameters for NUMA system
> > mm/damon: add NUMA-awareness to DAMON modules
>
> Following up to the above question, why they would want to use DAMON modules
> rather than manually controlling DAMON via DAMON sysfs interface?

IMHO, some users want to use DAMON features without manually
configuring them via the DAMON sysfs interface due to its complexity.
Since this patchset can also be adapted to the sysfs interface, I will
add NUMA-aware watermark support for the sysfs interface in the next
version.

Best Regards,
Jonghyeon

>
>
> Thanks,
> SJ
>
> >
> > include/linux/damon.h | 11 +++++++++--
> > mm/damon/core.c | 11 ++++++++---
> > mm/damon/lru_sort.c | 14 ++++++++++++++
> > mm/damon/modules-common.h | 4 +++-
> > mm/damon/reclaim.c | 14 ++++++++++++++
> > mm/damon/sysfs-schemes.c | 35 +++++++++++++++++++++++++++++++++--
> > 6 files changed, 81 insertions(+), 8 deletions(-)
> >
> > --
> > 2.34.1