On Wed, Dec 4, 2024 at 11:49 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
On Wed, 4 Dec 2024 19:09:40 +0800 Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx> wrote:
But this is not enough to free the empty PTE page table pages in paths other
that munmap and exit_mmap path, because IPI cannot be synchronized with
rcu_read_lock() in pte_offset_map{_lock}(). So we should let single table also
be freed by RCU like batch table freeing.
As a first step, we supported this feature on x86_64 and selectd the newly
introduced CONFIG_ARCH_SUPPORTS_PT_RECLAIM.
For other cases such as madvise(MADV_FREE), consider scanning and freeing empty
PTE pages asynchronously in the future.
Handling MADV_FREE sounds fairly straightforward?
AFAIU MADV_FREE usually doesn't immediately clear PTEs (except if they
are swap/hwpoison/... PTEs). So the easy thing to do would be to check
whether the page table has become empty within madvise(), but I think
the most likely case would be that PTEs still remain (and will be
asynchronously zapped later when memory pressure causes reclaim, or
something like that).
So I don't see an easy path to doing it for MADV_FREE.