Re: [PATCH v2] mm/swap: fix swap_info_struct race between swapoff and get_swap_pages()

From: Rongwei Wang
Date: Thu Apr 06 2023 - 08:55:25 EST


Oh, sorry, I missed this email just now because I was busy replying to your previous one.

On 2023/4/6 20:12, Aaron Lu wrote:
On Wed, Apr 05, 2023 at 12:08:47AM +0800, Rongwei Wang wrote:
Hello

I have fixed up some things based on patch v1. To help readers and
reviewers reproduce this bug, I am sharing a reproducer here:
I reproduced this problem under a VM this way:

$ sudo ./stress-ng --swap 1
// on another terminal
$ for i in `seq 8`; do ./swap & done
Looks simpler than yours :-)
Cool, it has indeed become simpler.
(I didn't realize you had posted your reproducer here since I'm not CCed;
I only found it after coming up with mine.)
Then the warning message normally appears within a few seconds.

Here is the code for the above swap prog:

#include <stdio.h>
#include <stddef.h>
#include <sys/mman.h>

#define SIZE 0x100000	/* 1 MiB of anonymous memory */

int main(void)
{
	int i, ret;
	void *p;

	p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return -1;
	}

	ret = 0;
	while (1) {
		/* Dirty every 4 KiB page so there is something to swap out. */
		for (i = 0; i < SIZE; i += 0x1000)
			((char *)p)[i] = 1;
		/* Ask the kernel to reclaim the whole range right away. */
		ret = madvise(p, SIZE, MADV_PAGEOUT);
		if (ret != 0) {
			perror("madvise");
			break;
		}
	}

	return ret;
}

Unfortunately, this test prog does not work on kernels before v5.4 because
MADV_PAGEOUT was introduced in v5.4. I tested on v5.4 and the problem is
also there.

Maybe that is why this bug was not found until now. We first found that it is triggered by running stress-ng-swap together with stress-ng-madvise (PAGEOUT), which seems to support that explanation.

It seems MADV_COLD was also introduced together with MADV_PAGEOUT. I have no other ideas and have to depend on you. :-)


Haven't found a way to trigger swap with swap devices coming and going on
kernels before v5.4; I tried putting the test prog in a memcg with a memory
limit, but then the prog is easily killed because there is nowhere to swap out.
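
To make "swap device come and go" concrete, the swapoff side of the race can be sketched as a plain swapon()/swapoff() loop. This is only an approximation of the churn that the stress-ng swap stressor generates, not its actual implementation. The helper name toggle_swap() is hypothetical, the path is whatever swap file or device was already prepared with mkswap, and the calls need CAP_SYS_ADMIN.

```c
#include <stdio.h>
#include <sys/swap.h>

/*
 * Hypothetical helper, for illustration only.
 * Enables and disables the swap area at 'path' up to 'rounds' times
 * and returns the number of completed swapon/swapoff cycles.
 */
int toggle_swap(const char *path, int rounds)
{
	int done = 0;
	int i;

	for (i = 0; i < rounds; i++) {
		if (swapon(path, 0) != 0) {
			perror("swapon");
			break;
		}
		if (swapoff(path) != 0) {
			perror("swapoff");
			break;
		}
		done++;
	}

	return done;
}
```

Running such a loop in one terminal while several copies of the swap prog run in another gives the same overall shape as the reproducer above: swap-out racing with swapoff.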

Personally, I do not intend to keep searching for a way to reproduce this on kernels before v5.4. Of course, if you have an idea, I can try it.


Thanks:-)