RE: [RESEND PATCH 1/1] mm: vmstat: Add OOM victims count in vmstat counter

From: David Rientjes
Date: Wed Oct 14 2015 - 18:05:15 EST


On Wed, 14 Oct 2015, PINTU KUMAR wrote:

> For me it was very helpful during sluggish and long duration ageing tests.
> With this, I don't have to look into the logs manually.
> I just monitor this count in a script.
> The moment I get nr_oom_victims > 1, I know that kernel OOM would have happened
> and I need to take the log dump.
> So, then I do: dmesg >> oom_logs.txt
> Or, even stop the tests for further tuning.
>

I think eventfd(2) was created for that purpose, to avoid the constant
polling that you would have to do to check nr_oom_victims and then take a
snapshot.
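
To make the difference concrete, here is a minimal userspace sketch of
waiting on an eventfd instead of polling a counter. It is not taken from
any patch. The helper names (notifier_create, notifier_wait,
notifier_signal) are made up for illustration, and notifier_signal is a
userspace stand-in for whatever the kernel would do on an oom event.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create an eventfd with an initial count of zero. Returns the fd or -1. */
static int notifier_create(void)
{
	return eventfd(0, EFD_CLOEXEC);
}

/*
 * Block until the eventfd counter is non-zero, then store it in *count.
 * read(2) sleeps until the fd is signalled, so no periodic wakeups are
 * needed. The counter is reset to zero by the read.
 * Returns 0 on success, -1 on error.
 */
static int notifier_wait(int efd, uint64_t *count)
{
	ssize_t n = read(efd, count, sizeof(*count));

	return n == (ssize_t)sizeof(*count) ? 0 : -1;
}

/*
 * Hypothetical stand-in for the kernel side: add "events" to the counter
 * and wake any reader. Returns 0 on success, -1 on error.
 */
static int notifier_signal(int efd, uint64_t events)
{
	ssize_t n = write(efd, &events, sizeof(events));

	return n == (ssize_t)sizeof(events) ? 0 : -1;
}
```

A monitoring process would call notifier_wait() in a loop and take its
dmesg snapshot each time it returns, rather than waking up on a timer to
re-read a counter.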

> > I disagree with this one, because we can encounter oom kills due to
> > fragmentation rather than low memory conditions for high-order allocations.
> > The amount of free memory may be substantially higher than all zone
> > watermarks.
> >
> AFAIK, kernel oom happens only for lower-order (PAGE_ALLOC_COSTLY_ORDER).
> For higher-order we get page allocation failure.
>

Order-3 is included. I've seen machines with _gigabytes_ of free memory
in ZONE_NORMAL on a node that still hit an order-3 page allocation failure
and called the oom killer.
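
A rough sketch of why that can happen, assuming the usual rule that the
oom killer is only considered for orders up to PAGE_ALLOC_COSTLY_ORDER
(which is 3). This is a simplification that ignores gfp flags,
watermarks and everything else the allocator looks at. NR_ORDERS, the
free_blocks layout (modelled on /proc/buddyinfo) and the function names
are assumptions for illustration only.

```c
#include <stdbool.h>

/* Same value as the kernel's PAGE_ALLOC_COSTLY_ORDER. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/* Hypothetical number of buddy orders tracked in this sketch. */
#define NR_ORDERS 11

/* Simplified: only orders <= PAGE_ALLOC_COSTLY_ORDER can reach the oom killer. */
static bool may_invoke_oom_killer(unsigned int order)
{
	return order <= PAGE_ALLOC_COSTLY_ORDER;
}

/* free_blocks[i] is the number of free blocks of 2^i contiguous pages. */
static unsigned long total_free_pages(const unsigned long free_blocks[NR_ORDERS])
{
	unsigned long pages = 0;
	unsigned int i;

	for (i = 0; i < NR_ORDERS; i++)
		pages += free_blocks[i] << i;
	return pages;
}

/* True if some free block is large enough to satisfy an allocation of "order". */
static bool has_free_block(const unsigned long free_blocks[NR_ORDERS],
			   unsigned int order)
{
	unsigned int i;

	for (i = order; i < NR_ORDERS; i++)
		if (free_blocks[i])
			return true;
	return false;
}
```

With free_blocks = { 400000, 200000, 100000 } and nothing at order 3 or
above, total_free_pages() reports 1200000 pages (several gigabytes with
4K pages), yet has_free_block(..., 3) is false and
may_invoke_oom_killer(3) is true. A counter of free memory alone would
not predict that kill.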

> > We've long had a desire to have a better oom reporting mechanism rather than
> > just the kernel log. It seems like you're feeling the same pain. I think it
> > would be better to have an eventfd notifier for system oom conditions so we
> > can track kernel oom kills (and conditions) in userspace. I have a patch for
> > that, and it works quite well when userspace is mlocked with a buffer in
> > memory.
> >
> Ok, this would be interesting.
> Can you point me to the patches?
> I will quickly check if it is useful for us.
>

https://lwn.net/Articles/589404. It's invasive and isn't upstream. I
would like to restructure that patchset to avoid the memcg trickery and
allow for a root-only eventfd(2) notification through procfs on system
oom.
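
For comparison, the per-memcg oom notification that already exists in
cgroup v1 is registered by writing "<event_fd> <fd of memory.oom_control>"
to cgroup.event_control. The sketch below shows only that existing memcg
interface, not the system-wide procfs one described above, which does not
exist upstream. The function names and the cgroup_dir parameter are
assumptions for illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Build the "<event_fd> <control_fd>" registration string. */
static int format_event_control(char *buf, size_t len, int efd, int ctl_fd)
{
	int n = snprintf(buf, len, "%d %d", efd, ctl_fd);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Register for oom events on the cgroup v1 memory cgroup at cgroup_dir.
 * Returns an eventfd to read(2) from on success, -1 on any failure.
 */
static int memcg_oom_register(const char *cgroup_dir)
{
	char path[4096], line[32];
	int efd, ctl_fd = -1, evc_fd = -1, n;

	efd = eventfd(0, EFD_CLOEXEC);
	if (efd < 0)
		return -1;

	n = snprintf(path, sizeof(path), "%s/memory.oom_control", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		goto fail;
	ctl_fd = open(path, O_RDONLY);
	if (ctl_fd < 0)
		goto fail;

	n = snprintf(path, sizeof(path), "%s/cgroup.event_control", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		goto fail;
	evc_fd = open(path, O_WRONLY);
	if (evc_fd < 0)
		goto fail;

	/* The kernel parses both fd numbers from this single write. */
	n = format_event_control(line, sizeof(line), efd, ctl_fd);
	if (n < 0 || write(evc_fd, line, (size_t)n) != n)
		goto fail;

	close(evc_fd);
	close(ctl_fd);
	return efd;

fail:
	if (evc_fd >= 0)
		close(evc_fd);
	if (ctl_fd >= 0)
		close(ctl_fd);
	close(efd);
	return -1;
}
```

The caller then blocks in read(2) on the returned fd. A listener that has
to keep working during an oom would presumably also want mlockall(2) and
a preallocated buffer, as mentioned earlier in the thread.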