[RFC PATCH] mm/madvise: enable files from read-only block for MADV_PAGEOUT
From: hailong
Date: Thu Dec 26 2024 - 08:23:23 EST
From: Hailong Liu <hailong.liu@xxxxxxxx>
Apps may load files from a read-only block device after startup and then
switch to the background. In this case, a system daemon reclaims the page
cache using process_madvise(vmas, MADV_PAGEOUT). However, can_do_file_pageout()
currently allows this only if inode_owner_or_capable() is true or
file_permission(vma->vm_file, MAY_WRITE) == 0.
In fact, for files on a read-only block device, these pages can be
discarded directly to free up memory.
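The effect of the extra condition can be modelled in user space as a small
truth table. This is only an illustrative sketch: struct pageout_caller and
the two helper functions are hypothetical names, and each boolean field merely
stands in for the result of the kernel check named in its comment. It is not
kernel code.

```c
#include <stdbool.h>

/* Hypothetical user-space model; each field stands in for one kernel check. */
struct pageout_caller {
	bool has_file;		/* vma->vm_file is set */
	bool owner_or_capable;	/* inode_owner_or_capable() result */
	bool may_write;		/* file_permission(..., MAY_WRITE) == 0 */
	bool sb_readonly;	/* sb_rdonly() on the file's super block */
};

/* Condition before this patch. */
static bool can_pageout_before(const struct pageout_caller *c)
{
	if (!c->has_file)
		return false;
	return c->owner_or_capable || c->may_write;
}

/* Condition with this patch: a read-only super block also qualifies. */
static bool can_pageout_after(const struct pageout_caller *c)
{
	if (!c->has_file)
		return false;
	return c->owner_or_capable || c->sb_readonly || c->may_write;
}
```

In this model, a caller that neither owns the file nor could open it for
writing is rejected by can_pageout_before() and accepted by
can_pageout_after() only when sb_readonly is set; every other combination
gives the same answer before and after.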
The test results are as follows:
Before:
                  Pss  Private  Private  SwapPss      Rss     Heap     Heap     Heap
                Total    Dirty    Clean    Dirty    Total     Size    Alloc     Free
               ------   ------   ------   ------   ------   ------   ------   ------
    .so mmap      199        0        0      128    27616
   .jar mmap      748        0       88        0    39444
   .apk mmap     6818        0     6188        0     8076
   .dex mmap      102        0       80       44     1120
   .oat mmap      148        0        0        0    11836
   .art mmap      341        0        8      652    30748
  Other mmap       42        0        0        4     2228
     Unknown        5        0        0     1012     1528
       TOTAL    17984        0     6372     9412   138096    31023    13628    17394

After:
                  Pss  Private  Private  SwapPss      Rss     Heap     Heap     Heap
                Total    Dirty    Clean    Dirty    Total     Size    Alloc     Free
               ------   ------   ------   ------   ------   ------   ------   ------
    .so mmap      206        0        0      132    27332
   .jar mmap      625        0        0        0    39288
   .apk mmap      613        0        0        0     1668
   .dex mmap       22        0        0       44     1040
   .oat mmap      151        0        0        0    11836
   .art mmap      340        0        0      636    30756
  Other mmap       44        0        0        4     2248
     Unknown        6        0        0     1004     1532
       TOTAL    11801        0        8     9624   131336    28939    13485    15453
From the above we can see that the *.apk mmap* is reclaimed: its Pss Total
drops from 6818 to 613 and its Rss Total from 8076 to 1668.
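For anyone who wants to issue the same advice on a single mapping without the
daemon, the following is a minimal user-space sketch. It uses madvise() on the
caller's own mapping rather than process_madvise(), and pageout_file_mapping()
is a hypothetical helper name. The fallback definition of MADV_PAGEOUT assumes
the asm-generic value and is only there for older libc headers. The sketch
only issues the advice; whether the pages were actually dropped still has to be
checked separately, for example with the meminfo output shown above.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21	/* assumption: asm-generic value */
#endif

/*
 * Hypothetical helper: map 'path' read-only, fault its pages in, then ask
 * the kernel to page them out.
 * Returns the madvise() result (0, or -1 with errno set), or -2 if the
 * file could not be opened or mapped.
 */
static int pageout_file_mapping(const char *path)
{
	volatile unsigned char sink;
	long page = sysconf(_SC_PAGESIZE);
	struct stat st;
	unsigned char *p;
	int fd, rc, saved;
	off_t off;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -2;
	if (page <= 0 || fstat(fd, &st) < 0 || st.st_size <= 0) {
		close(fd);
		return -2;
	}

	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);	/* the mapping stays valid after close() */
	if (p == MAP_FAILED)
		return -2;

	/* Touch one byte per page so the page cache is populated and mapped. */
	for (off = 0; off < st.st_size; off += page)
		sink = p[off];
	(void)sink;

	rc = madvise(p, (size_t)st.st_size, MADV_PAGEOUT);
	saved = errno;	/* keep madvise()'s errno across munmap() */
	munmap(p, (size_t)st.st_size);
	errno = saved;
	return rc;
}
```

Calling pageout_file_mapping() on a file the caller owns goes through the
existing inode_owner_or_capable() path; the case this patch targets is a file
on a read-only mount that the caller neither owns nor could open for writing.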
Signed-off-by: Hailong Liu <hailong.liu@xxxxxxxx>
---
mm/madvise.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index 8e5bf11af1b2..503ee5e03b7e 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -337,12 +337,13 @@ static inline bool can_do_file_pageout(struct vm_area_struct *vma)
 		return false;
 	/*
 	 * paging out pagecache only for non-anonymous mappings that correspond
-	 * to the files the calling process could (if tried) open for writing;
-	 * otherwise we'd be including shared non-exclusive mappings, which
-	 * opens a side channel.
+	 * to the files the calling process could (if tried) open for writing or
+	 * file from read-only super block; otherwise we'd be including
+	 * shared non-exclusive mappings, which opens a side channel.
 	 */
 	return inode_owner_or_capable(&nop_mnt_idmap,
 				      file_inode(vma->vm_file)) ||
+	       sb_rdonly(file_inode(vma->vm_file)->i_sb) ||
 	       file_permission(vma->vm_file, MAY_WRITE) == 0;
 }
--
2.30.0