If folio_test_anon(folio) && folio_test_swapbacked(folio) condition is used, can
On Wed, Jul 10, 2024 at 4:04 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
On 10.07.24 06:02, Barry Song wrote:
You're correct. I overlooked this aspect, focusing on swap and thinking of shmem
On Wed, Jul 10, 2024 at 3:59 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
But they won't get necessarily *freed* when unmapping them. They might
On 10.07.24 05:32, Barry Song wrote:
my point is that the purpose is skipping redundant swap-out, if shmem is single
On Wed, Jul 10, 2024 at 9:23 AM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
BTW, we dropped the folio_test_anon() check, but what about shmem? They
On Tue, 9 Jul 2024 20:31:15 +0800 Zhiguo Jiang <justinjiang@xxxxxxxx> wrote:
Hi Andrew,
The releasing process of the non-shared anonymous folio mapped solely by
an exiting process may go through two flows: 1) the anonymous folio is
first swapped out to swap space and transformed into a swp_entry
in shrink_folio_list; 2) then the swp_entry is released in the process
exiting flow. This results in a high CPU load when releasing a
non-shared anonymous folio mapped solely by an exiting process.

This is likely to happen when the system is low on memory while a
process is exiting, because the non-shared anonymous folio mapped
solely by an exiting process may be reclaimed by shrink_folio_list.

With this patch, shrink skips the non-shared anonymous folio solely
mapped by an exiting process, and the folio is released directly in
the process exiting flow, which saves swap-out time and alleviates
the load of the process exiting.

It would be helpful to provide some before-and-after runtime
measurements, please. It's a performance optimization so please let's
see what effect it has.
This was something I was curious about too, so I created a small test program
that allocates and continuously writes to 256MB of memory. Using QEMU, I set
up a small machine with only 300MB of RAM to trigger kswapd.
qemu-system-aarch64 -M virt,gic-version=3,mte=off -nographic \
    -smp cpus=4 -cpu max \
    -m 300M -kernel arch/arm64/boot/Image
The test program will be randomly terminated by its subprocess to trigger
the use case of this patch.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <signal.h>

#define MEMORY_SIZE (256 * 1024 * 1024)

unsigned char *memory;

void allocate_and_write_memory()
{
	memory = (unsigned char *)malloc(MEMORY_SIZE);
	if (memory == NULL) {
		perror("malloc");
		exit(EXIT_FAILURE);
	}

	while (1)
		memset(memory, 0x11, MEMORY_SIZE);
}

int main()
{
	pid_t pid;

	srand(time(NULL));
	pid = fork();
	if (pid < 0) {
		perror("fork");
		exit(EXIT_FAILURE);
	}

	if (pid == 0) {
		int delay = (rand() % 10000) + 10000;

		usleep(delay * 1000);
		/* kill parent when it is busy on swapping */
		kill(getppid(), SIGKILL);
		_exit(0);
	} else {
		allocate_and_write_memory();
		wait(NULL);
		free(memory);
	}

	return 0;
}
I tracked the number of folios that could be redundantly
swapped out by adding a simple counter as shown below:
@@ -879,6 +880,9 @@ static bool folio_referenced_one(struct folio *folio,
 		     check_stable_address_space(vma->vm_mm)) &&
 		    folio_test_swapbacked(folio) &&
 		    !folio_likely_mapped_shared(folio)) {
+			static long i, size;
+			size += folio_size(folio);
+			pr_err("index: %ld skipped folio:%lx total size:%ld\n", i++, (unsigned long)folio, size);
 			pra->referenced = -1;
 			page_vma_mapped_walk_done(&pvmw);
 			return false;
This is what I have observed:
/ # /home/barry/develop/linux/skip_swap_out_test
[ 82.925645] index: 0 skipped folio:fffffdffc0425400 total size:65536
[ 82.925960] index: 1 skipped folio:fffffdffc0425800 total size:131072
[ 82.927524] index: 2 skipped folio:fffffdffc0425c00 total size:196608
[ 82.928649] index: 3 skipped folio:fffffdffc0426000 total size:262144
[ 82.929383] index: 4 skipped folio:fffffdffc0426400 total size:327680
[ 82.929995] index: 5 skipped folio:fffffdffc0426800 total size:393216
...
[ 88.469130] index: 6112 skipped folio:fffffdffc0390080 total size:97230848
[ 88.469966] index: 6113 skipped folio:fffffdffc038d000 total size:97296384
[ 89.023414] index: 6114 skipped folio:fffffdffc0366cc0 total size:97300480
I observed that this patch effectively skipped 6115 folios (index 0 through
6114, each either a 4KB page or a 64KB mTHP), potentially reducing the swap-out
by up to about 92.8MB (97,300,480 bytes) during the process exit.
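
The log totals above allow a quick cross-check of the folio mix. The following is a minimal sketch, assuming (as stated above) that every skipped folio is either a 4KB page or a 64KB mTHP. The helper name split_folio_mix is hypothetical and not part of the patch; only the folio count and byte total come from the log.

```c
#define SMALL_FOLIO	4096L		/* 4KB base page */
#define LARGE_FOLIO	65536L		/* 64KB mTHP */

/*
 * Hypothetical helper, not part of the patch: given the number of skipped
 * folios and their total size in bytes, solve
 *     small + large = count
 *     SMALL_FOLIO * small + LARGE_FOLIO * large = total_bytes
 * Returns 0 on success, -1 if no non-negative integer solution exists.
 */
static int split_folio_mix(long count, long total_bytes,
			   long *small, long *large)
{
	/* Bytes left over after counting every folio as a 4KB page. */
	long extra = total_bytes - count * SMALL_FOLIO;
	/* Each 64KB folio contributes this much more than a 4KB one. */
	long step = LARGE_FOLIO - SMALL_FOLIO;

	if (extra < 0 || extra % step != 0)
		return -1;

	*large = extra / step;
	*small = count - *large;
	return *small >= 0 ? 0 : -1;
}
```

Called with the final log values (6115 folios, counting index 0 through 6114, and 97300480 bytes), this gives 1176 folios of 64KB and 4939 folios of 4KB. A count of 6114 has no integer solution, which is consistent with the index in the log being zero-based.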
Despite the numerous mistakes Zhiguo made in sending this patch, it is still
quite valuable. Please consider pulling his v9 into the mm tree for testing.
also do __folio_set_swapbacked()?
mapped, they could be also skipped.
just continue living in tmpfs? where some other process might just map
them later?
solely in terms of swap.
IMHO, there is a big difference here between anon and shmem. (well,
anon_shmem would actually be different :) )
Even though anon_shmem behaves similarly to anonymous memory when
releasing memory, it doesn't seem worth the added complexity?
So unfortunately it seems Zhiguo still needs a v10 that brings the
folio_test_anon() check back? Sorry, my bad, Zhiguo.
Thanks
--
Cheers,

David / dhildenb

Thanks
Barry
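
To make the anon versus shmem point in this thread concrete, here is a small userspace model of the skip decision. It is a sketch only: struct folio_model and both predicates are hypothetical stand-ins, not kernel code. The fields merely mirror the checks named in the thread (folio_test_anon(), folio_test_swapbacked(), folio_likely_mapped_shared()) plus an assumed flag for "the only mapping process is exiting".

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the folio state discussed above. */
struct folio_model {
	bool anon;		/* models folio_test_anon() */
	bool swapbacked;	/* models folio_test_swapbacked() */
	bool mapped_shared;	/* models folio_likely_mapped_shared() */
	bool mm_exiting;	/* assumed: the mapping process is exiting */
};

/* Check without folio_test_anon(): singly mapped shmem also qualifies. */
static bool skip_swapbacked_only(const struct folio_model *f)
{
	return f->mm_exiting && f->swapbacked && !f->mapped_shared;
}

/* Check with folio_test_anon() restored: shmem folios are excluded. */
static bool skip_anon_swapbacked(const struct folio_model *f)
{
	return f->anon && skip_swapbacked_only(f);
}
```

For a singly mapped shmem folio of an exiting process (anon false, swapbacked true), the first predicate returns true and the second returns false. That is the difference raised above: such a folio may keep living in tmpfs after the unmap, so the redundant swap-out argument does not apply to it.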