Re: [RFC PATCH v1 0/2] mm: multi-gen LRU scanning for page promotion
From: Yuanchu Xie
Date: Tue Mar 25 2025 - 17:56:27 EST
On Tue, Mar 25, 2025 at 4:56 AM Bharata B Rao <bharata@xxxxxxx> wrote:
>
> Thanks for your patchset. I haven't looked at the patches in detail yet,
> but gave it a quick try with the micro-benchmark that I have been using.
Thanks for running the numbers. Unfortunately neither of us can attend
LSF/MM in person, but we're excited about this opportunity for
collaboration.
>
> The below numbers can be compared with the base numbers that I have
> posted here
> (https://lore.kernel.org/linux-mm/20250325081832.209140-1-bharata@xxxxxxx/).
> Test 2 in the above link is the one I tried with this patchset.
>
> kernel.numa_balancing = 0
> demotion=true
> cpufreq governor=performance
>
> Benchmark run configuration:
> Compute-node = 1
> Memory-node = 2
> Memory-size = 206158430208
> Hot-region-size = 1073741824
> Nr-hot-regions = 192
> Access pattern = random
> Access granularity = 4096
> Delay b/n accesses = 0
> Load/store ratio = 50l50s
> THP used = no
> Nr accesses = 25769803776
> Nr repetitions = 512
>
> Benchmark completed in 605983205.0 us
The benchmark does seem to complete in less time, but I'm not sure why,
especially given the small number of pages promoted. I think it would
also be useful to see the DRAM/CXL usage breakdown over time.
>
> numa_hit 63621437
> numa_miss 2721737
> numa_foreign 2721737
> numa_interleave 0
> numa_local 48243292
> numa_other 18099882
> pgpromote_success 0
> pgpromote_candidate 0
> pgdemote_kswapd 15409682
> pgdemote_direct 0
> pgdemote_khugepaged 0
> numa_pte_updates 0
> numa_huge_pte_updates 0
> numa_hint_faults 0
> numa_hint_faults_local 0
> numa_pages_migrated 19596
> pgmigrate_success 15429278
> pgmigrate_fail 256
>
> kpromoted_recorded_accesses 27647687
> kpromoted_recorded_hwhints 0
> kpromoted_recorded_pgtscans 27647687
> kpromoted_record_toptier 0
Makes sense, since we skip toptier scanning.
> kpromoted_record_added 17184209
> kpromoted_record_exists 10463478
> kpromoted_mig_right_node 0
> kpromoted_mig_non_lru 404308
> kpromoted_mig_cold_old 6417567
> kpromoted_mig_cold_not_accessed 10342825
> kpromoted_mig_promoted 19509
Compared to 611077 (the IBS number), this is a lot lower.
> kpromoted_mig_dropped 17164700
>
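As a sanity check on the kpromoted counters above, here is a small
sketch that checks how they add up. The accounting relations are my
assumption from the counter names and the arithmetic (added + exists =
recorded, promoted + dropped = added, and the three drop reasons =
dropped), not something confirmed from the code. The variable names are
made up for illustration.

```python
# Counters copied from the quoted run above.
recorded_accesses = 27_647_687
record_added = 17_184_209
record_exists = 10_463_478
mig_promoted = 19_509
mig_dropped = 17_164_700
mig_non_lru = 404_308
mig_cold_old = 6_417_567
mig_cold_not_accessed = 10_342_825
ibs_promoted = 611_077  # promoted count from the IBS run cited above

# Assumed accounting relations; each evaluates to True for this run.
added_plus_exists_ok = record_added + record_exists == recorded_accesses
promoted_plus_dropped_ok = mig_promoted + mig_dropped == record_added
drop_reasons = mig_non_lru + mig_cold_old + mig_cold_not_accessed
drop_reasons_ok = drop_reasons == mig_dropped

# Share of newly added records that ended up promoted, and how far
# this run is below the IBS-driven one.
promoted_share = mig_promoted / record_added
ibs_ratio = ibs_promoted / mig_promoted

print(added_plus_exists_ok, promoted_plus_dropped_ok, drop_reasons_ok)
print(f"promoted share of added records: {promoted_share:.4%}")
print(f"IBS promoted / this run promoted: {ibs_ratio:.1f}x")
```

If those relations hold in general, nearly all added records are
dropped as cold or not accessed rather than promoted.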
> When I try to get the same benchmark numbers for kpromoted driven by
> kmmscand, kpromoted gets overwhelmed with the amount of data that
> kmmscand provides while no such issues with the amount of accesses
> reported by this patchset.
The scan interval in this series is 4 seconds, while kmmscand pauses
for 16ms between scans, so there are definitely some gaps here.
The MGLRU page table walk also has a bunch of optimizations, some of
which are more focused on reclaim, so we might need to tweak some
things there too.
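
For a rough sense of the gap, a quick sketch comparing the two
intervals. This only compares the nominal numbers mentioned above and
assumes nothing about how long each scan actually takes, so it is not a
measure of the access volume either scanner produces.

```python
# Nominal intervals from the discussion above.
mglru_scan_interval_ms = 4 * 1000  # 4 second scan interval in this series
kmmscand_pause_ms = 16             # kmmscand pause between scans

# How many kmmscand pauses fit into one scan interval of this series.
interval_ratio = mglru_scan_interval_ms / kmmscand_pause_ms
print(f"interval ratio: {interval_ratio:.0f}x")
```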
Yuanchu