Re: [PATCH] trace: Set oom_score_adj to maximum for ring buffer allocating process

From: David Rientjes
Date: Fri May 27 2011 - 19:22:33 EST


On Fri, 27 May 2011, Vaibhav Nagarnaik wrote:

> The tracing ring buffer is allocated from kernel memory. While
> allocating the memory, if OOM happens, the allocating process might not
> be the one that gets killed, since the ring-buffer memory is not
> allocated as process memory. Thus random processes might get killed
> during the allocation.
>
> This patch makes sure that the allocating process is considered the most
> likely oom-kill-able process while the allocating is going on. Thus if
> oom-killer is invoked because of ring-buffer allocation, it is easier
> for the ring buffer memory to be freed and save important system
> processes from being killed.
>
> This patch also adds __GFP_NORETRY flag to the ring buffer allocation
> calls to make it fail more gracefully if the system will not be able to
> complete the allocation request.
>
> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@xxxxxxxxxx>

Still not sure this is what we want, I'm afraid.

I like the addition of __GFP_NORETRY, but I don't understand the use of
test_set_oom_score_adj() here. Why can't we use oom_killer_disable(),
allocate with __GFP_NORETRY, and then do oom_killer_enable()?

This prevents other tasks from getting oom killed themselves if they have
oom_score_adj of OOM_SCORE_ADJ_MAX and allows the write to fail with
-ENOMEM rather than being oom killed out from under us.
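
As a rough illustration of the disable/allocate/enable sequence suggested
above, here is a userspace mock-up. It is not kernel code: every mock_*
name is a hypothetical stand-in invented for this sketch, the page counter
only imitates memory pressure, and the real oom_killer_disable() and
oom_killer_enable() may have different signatures and semantics depending
on the kernel version.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; they only model the behaviour. */
static bool mock_oom_killer_disabled;
static size_t mock_pages_free = 8;	/* pretend 8 pages are available */

static void mock_oom_killer_disable(void) { mock_oom_killer_disabled = true; }
static void mock_oom_killer_enable(void)  { mock_oom_killer_disabled = false; }

/* Models a __GFP_NORETRY allocation: fail fast instead of retrying. */
static void *mock_alloc_page_noretry(void)
{
	if (mock_pages_free == 0)
		return NULL;
	mock_pages_free--;
	return malloc(64);
}

/* Allocate all nr_pages or none; returns 0 on success, -ENOMEM on failure. */
static int mock_ring_buffer_alloc(void **pages, size_t nr_pages)
{
	size_t i;
	int ret = 0;

	mock_oom_killer_disable();
	for (i = 0; i < nr_pages; i++) {
		/* No task can be oom killed while we are allocating. */
		assert(mock_oom_killer_disabled);
		pages[i] = mock_alloc_page_noretry();
		if (!pages[i]) {
			ret = -ENOMEM;
			break;
		}
	}
	mock_oom_killer_enable();

	/* On failure, give back whatever was allocated so far. */
	if (ret) {
		while (i > 0) {
			free(pages[--i]);
			mock_pages_free++;
		}
	}
	return ret;
}
```

The point of the ordering is that the failure surfaces as an error code to
the writer, and the oom killer is re-enabled on both the success and the
failure path.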

So why is test_set_oom_score_adj() better?

The alternative would be to set up an oom notifier for the ring buffer and
stop allocating prior to killing a task and return a size that was smaller
than what the user requested.
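
The notifier alternative can be sketched the same way. This is again a
userspace mock-up under stated assumptions: the sim_* names are
hypothetical, the callback stands in for something that would be hooked up
through the kernel's oom notifier chain (register_oom_notifier()), and the
page counter only imitates running out of memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; they only model the behaviour. */
static bool sim_oom_pending;
static size_t sim_pages_free = 8;	/* pretend 8 pages are available */

/* Stand-in for an oom notifier callback, run before any task is killed. */
static void sim_ring_buffer_oom_notify(void)
{
	sim_oom_pending = true;
}

static void *sim_alloc_page(void)
{
	if (sim_pages_free == 0) {
		/* Out of memory: notify instead of killing anything. */
		sim_ring_buffer_oom_notify();
		return NULL;
	}
	sim_pages_free--;
	return malloc(64);
}

/*
 * Try to allocate nr_pages, but stop as soon as the notifier fires.
 * Returns the number of pages actually allocated, which may be smaller
 * than what was requested.
 */
static size_t sim_ring_buffer_resize(void **pages, size_t nr_pages)
{
	size_t i;

	sim_oom_pending = false;
	for (i = 0; i < nr_pages && !sim_oom_pending; i++) {
		pages[i] = sim_alloc_page();
		if (!pages[i])
			break;
	}
	return i;
}
```

Here the caller keeps the partial buffer and reports the smaller size back
to the user instead of failing the whole request.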