Re: Can't we use timeout based OOM warning/killing?

From: Tetsuo Handa
Date: Tue Oct 06 2015 - 10:51:56 EST


Tetsuo Handa wrote:
> Sorry. This was my misunderstanding. But I still think that we need to be
> prepared for cases where zapping OOM victim's mm approach fails.
> ( http://lkml.kernel.org/r/201509242050.EHE95837.FVFOOtMQHLJOFS@xxxxxxxxxxxxxxxxxxx )

I tested how hard it is to make the approach of zapping the OOM victim's mm
fail. It turns out that it is not difficult to make it fail.

---------- Reproducer start ----------
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/mman.h>

static int reader(void *unused)
{
	char c;
	int fd = open("/proc/self/cmdline", O_RDONLY);
	while (pread(fd, &c, 1, 0) == 1);
	return 0;
}

static int writer(void *unused)
{
	const int fd = open("/proc/self/exe", O_RDONLY);
	static void *ptr[10000];
	int i;
	sleep(2);
	while (1) {
		for (i = 0; i < 10000; i++)
			ptr[i] = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, fd,
				      0);
		for (i = 0; i < 10000; i++)
			munmap(ptr[i], 4096);
	}
	return 0;
}

int main(int argc, char *argv[])
{
	int zero_fd = open("/dev/zero", O_RDONLY);
	char *buf = NULL;
	unsigned long size = 0;
	int i;
	for (size = 1048576; size < 512UL * (1 << 30); size <<= 1) {
		char *cp = realloc(buf, size);
		if (!cp) {
			size >>= 1;
			break;
		}
		buf = cp;
	}
	for (i = 0; i < 100; i++) {
		clone(reader, malloc(1024) + 1024, CLONE_THREAD | CLONE_SIGHAND | CLONE_VM,
		      NULL);
	}
	clone(writer, malloc(1024) + 1024, CLONE_THREAD | CLONE_SIGHAND | CLONE_VM, NULL);
	read(zero_fd, buf, size); /* Will cause OOM due to overcommit */
	return * (char *) NULL; /* Kill all threads. */
}
---------- Reproducer end ----------

(I wrote this program to mimic a problem in which a customer's system hung
with a lot of ps processes blocked while reading /proc/pid/ entries, due to
an unkillable down_read(&mm->mmap_sem) in __access_remote_vm(). I couldn't
identify which function was holding the mmap_sem for writing, though...)
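
As a side note for anyone who has to monitor such a system: a tool that reads
/proc/pid/ entries directly can get stuck the same way ps did. Below is a
minimal sketch, assuming it is acceptable to do the read in a throwaway child
process and to poll for it with a deadline. probe_cmdline() and its return
codes are hypothetical names made up for this illustration; they are not part
of the reproducer or of any existing tool.

```c
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original report): read one byte of
 * /proc/<pid>/cmdline in a child process and give up after timeout_ms.
 * Returns 0 if the read finished, 1 on timeout, -1 on any other failure.
 */
static int probe_cmdline(pid_t pid, int timeout_ms)
{
	char path[64];
	const struct timespec tick = { 0, 10 * 1000 * 1000 }; /* 10 ms */
	pid_t child;
	int waited;

	snprintf(path, sizeof(path), "/proc/%d/cmdline", (int) pid);
	child = fork();
	if (child < 0)
		return -1;
	if (child == 0) {
		char c;
		int fd = open(path, O_RDONLY);
		/* This read is the call that may block on mmap_sem. */
		if (fd < 0 || read(fd, &c, 1) < 0)
			_exit(1);
		_exit(0);
	}
	/* Parent: poll for the child instead of blocking in waitpid(). */
	for (waited = 0; waited <= timeout_ms; waited += 10) {
		int status;
		pid_t r = waitpid(child, &status, WNOHANG);
		if (r == child)
			return (WIFEXITED(status) &&
				WEXITSTATUS(status) == 0) ? 0 : -1;
		if (r < 0)
			return -1;
		nanosleep(&tick, NULL);
	}
	/*
	 * SIGKILL only takes effect once the child leaves uninterruptible
	 * sleep, so the child is deliberately not waited for here.
	 */
	kill(child, SIGKILL);
	return 1;
}
```

Calling probe_cmdline(pid, 1000) returns 1 when the read did not finish within
one second. This only keeps the caller itself from hanging; the child stays
blocked for as long as the mmap_sem waiter does.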

The part with uptime > 429 of http://I-love.SAKURA.ne.jp/tmp/serial-20151006.txt.xz
shows an OOM livelock in which:

(1) the thread group leader is blocked at down_read(&mm->mmap_sem) in
exit_mm() called from do_exit().

(2) the writer thread is blocked at down_write(&mm->mmap_sem) in
vm_mmap_pgoff() called from SyS_mmap_pgoff() called from SyS_mmap().

(3) many reader threads are blocking the writer thread because of
down_read(&mm->mmap_sem) called from proc_pid_cmdline_read().

(4) while the thread group leader is blocked at down_read(&mm->mmap_sem),
some of the reader threads are trying to allocate memory via page fault.
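
The ordering in (1) to (3) can be sketched with a toy model. This is only an
illustration under the assumption that the semaphore is fair, i.e. a new
reader is not allowed to overtake a writer that is already queued. It is
plain single-threaded C, not the kernel's rwsem code, and struct toy_rwsem,
toy_down_read(), toy_down_write() and toy_up_read() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_WAITERS 16

enum waiter_kind { WAIT_READ, WAIT_WRITE };

/* Toy model of a fair reader/writer semaphore; not the kernel's rwsem. */
struct toy_rwsem {
	int active_readers;                  /* readers holding the lock */
	bool writer_active;                  /* true while a writer holds it */
	enum waiter_kind queue[MAX_WAITERS]; /* FIFO of blocked requests */
	size_t nr_waiters;
};

/* Returns true if granted at once, false if the caller would block. */
static bool toy_down_read(struct toy_rwsem *sem)
{
	/* Fairness rule: a reader may not overtake anyone already queued. */
	if (!sem->writer_active && sem->nr_waiters == 0) {
		sem->active_readers++;
		return true;
	}
	/* Requests beyond MAX_WAITERS are simply not recorded in this toy. */
	if (sem->nr_waiters < MAX_WAITERS)
		sem->queue[sem->nr_waiters++] = WAIT_READ;
	return false;
}

/* Returns true if granted at once, false if the caller would block. */
static bool toy_down_write(struct toy_rwsem *sem)
{
	if (!sem->writer_active && sem->active_readers == 0 &&
	    sem->nr_waiters == 0) {
		sem->writer_active = true;
		return true;
	}
	if (sem->nr_waiters < MAX_WAITERS)
		sem->queue[sem->nr_waiters++] = WAIT_WRITE;
	return false;
}

/* Drop a read hold; the last reader hands the lock to a queued writer. */
static void toy_up_read(struct toy_rwsem *sem)
{
	if (sem->active_readers > 0)
		sem->active_readers--;
	if (sem->active_readers == 0 && sem->nr_waiters > 0 &&
	    sem->queue[0] == WAIT_WRITE) {
		memmove(&sem->queue[0], &sem->queue[1],
			(sem->nr_waiters - 1) * sizeof(sem->queue[0]));
		sem->nr_waiters--;
		sem->writer_active = true;
	}
}
```

With one reader holding the lock, toy_down_write() queues the writer, and a
following toy_down_read() (standing in for the thread group leader in
exit_mm()) queues behind it. Nobody in the queue moves until the holder calls
toy_up_read(), which in (4) is a reader waiting for memory.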

So, zapping the first OOM victim's mm might fail by chance.