Re: [PATCH 00/10] OOM Debug print selection and additional information

From: Tetsuo Handa
Date: Thu Aug 29 2019 - 10:09:17 EST


On 2019/08/29 20:56, Michal Hocko wrote:
>> But please be aware that, I REPEAT AGAIN, I don't think neither eBPF nor
>> SystemTap will be suitable for dumping OOM information. OOM situation means
>> that even single page fault event cannot complete, and temporary memory
>> allocation for reading from kernel or writing to files cannot complete.
>
> And I repeat that no such reporting is going to write to files. This is
> an OOM path afterall.

The process that fetches data from, e.g., eBPF events must not trigger page
faults. The front end for iovisor/bcc is a Python userspace process, and I
don't think such a process can run in an OOM situation.

>
>> Therefore, we will need to hold all information in kernel memory (without
>> allocating any memory when OOM event happened). Dynamic hooks could hold
>> a few lines of output, but not all lines we want. The only possible buffer
>> which is preallocated and large enough would be printk()'s buffer. Thus,
>> I believe that we will have to use printk() in order to dump OOM information.
>> At that point,
>
> Yes, this is what I've had in mind.

I probably cut my explanation too short.

Dynamic hooks could hold a few lines of output, but they cannot hold all lines
when dump_tasks() reports 32000+ processes. We have to buffer all output in
kernel memory because even a single page fault triggered by the Python process
that monitors eBPF events (and writes the result to a log file or similar)
cannot complete while out_of_memory() is in flight.

And "set /proc/sys/vm/oom_dump_tasks to 0" is not the right reaction. What I'm
saying is "we won't be able to hold output from dump_tasks() if output from
dump_tasks() goes to a buffer preallocated for dynamic hooks". We have to find
a way that can handle the worst case.
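
To make the worst case concrete, here is a minimal userspace sketch, not kernel
code. Everything in it is an assumption for illustration: the hook_buf
structure, the 4096-byte HOOK_BUF_SIZE, and the per-task line format are
hypothetical and do not correspond to any existing kernel interface. It only
shows that a small buffer which must not allocate at OOM time ends up dropping
most lines once the task count gets large.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical size of a buffer preallocated for a dynamic hook. */
#define HOOK_BUF_SIZE 4096

struct hook_buf {
	char data[HOOK_BUF_SIZE];
	size_t used;            /* bytes already stored */
	unsigned long stored;   /* lines that fit */
	unsigned long dropped;  /* lines that did not fit */
};

/*
 * Append one line without allocating any memory. If the line does not
 * fit into the remaining space, it is counted as dropped.
 */
static void hook_buf_emit(struct hook_buf *b, const char *line)
{
	size_t len = strlen(line);

	if (len > HOOK_BUF_SIZE - b->used) {
		b->dropped++;
		return;
	}
	memcpy(b->data + b->used, line, len);
	b->used += len;
	b->stored++;
}

/* Emit one made-up line per task, loosely imitating a per-task dump. */
static void simulate_dump(struct hook_buf *b, unsigned long ntasks)
{
	char line[128];
	unsigned long i;

	for (i = 0; i < ntasks; i++) {
		snprintf(line, sizeof(line), "[%7lu] task-%lu\n", i, i);
		hook_buf_emit(b, line);
	}
}
```

Calling simulate_dump() on a zeroed hook_buf with 32000 tasks leaves only the
first part of the dump in the buffer, and the dropped counter accounts for the
rest. A larger preallocated buffer only moves the point at which lines start
being lost, which is why I think the worst case has to go through printk()'s
buffer.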