Re: Many unexplainable OOMs after upgrading to 4.7.x kernel

From: Markus Trippelsdorf
Date: Thu Aug 25 2016 - 11:58:30 EST


On 2016.08.25 at 17:40 +0200, David Madore wrote:
> TL;DR: Why is Firefox getting OOM-killed while I have 24GB free swap?
>
> A few days ago I upgraded the kernel on my desktop PC from 4.5.5 to
> 4.7.2 and, since then, I've witnessed a huge number of cases where
> various processes (typically Firefox) got OOM-killed by the kernel.
> Before this kernel upgrade, I had never seen a single OOM event in
> normal use; now I've had dozens in a couple of days. Nothing has
> changed in my config apart from the kernel. Clearly, something has
> changed for the worse!
>
> In fact, this morning, the problem was so bad that I was simply unable
> to start Firefox (any attempt to do so would result in it getting
> killed immediately), even though I had about 24GB of free swap
> available (and the system was idle). I had to kill a few unrelated
> processes to be able to launch Firefox; and even then, at some point,
> running "sync" in a different window caused Firefox to be
> OOM-killed.
>
> (Note: the problem IS NOT the choice of the process being killed:
> Firefox is a reasonable target. The problem is that the OOM-killer is
> being invoked at all, whereas there is plenty of free swap, and,
> before this kernel upgrade, all seemed to work perfectly.)
>
> Unfortunately, all of this is very unreproducible: now it seems to
> have disappeared completely, so I can't really test anything any more.
> All I can offer is some sample output (below) from the kernel's logs.
> Any suggestions as to what I should do if and when the problem returns
> are welcome, either to debug or to work around these OOMs.
>
> The computer in question is an x86_64 (Intel Core 2 Quad Q6600) box
> with 8GB RAM and 24GB swap (4*6GB swap across four different disks): I
> can of course offer any additional details as to its hardware, kernel
> or userland config.
>
> Sample log output from OOM-killer:
>
> ### cut after ###
> Aug 25 11:46:12 vega kernel: [115461.357412] firefox invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=2, oom_score_adj=0

Welcome to the club.
See https://lkml.org/lkml/2016/8/22/184 for a possible fix.
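
For reading log lines like the one quoted above, here is a rough sketch
(not kernel tooling) that pulls the fields out of an "invoked oom-killer"
line and converts the order into a size. The helper name parse_oom_line
and the 4 KiB page size are assumptions made for illustration; order=N
means the allocation wanted 2^N physically contiguous pages, so a failing
order=2 request may point at fragmentation rather than at a lack of swap.

```python
import re

# Hypothetical helper for illustration; not part of any kernel tool.
OOM_RE = re.compile(
    r"(?P<comm>\S+) invoked oom-killer: "
    r"gfp_mask=(?P<gfp>0x[0-9a-fA-F]+)\((?P<flags>[^)]*)\), "
    r"order=(?P<order>\d+), oom_score_adj=(?P<adj>-?\d+)"
)

PAGE_SIZE = 4096  # assumption: 4 KiB base pages (the x86_64 default)


def parse_oom_line(line):
    """Return the fields of an 'invoked oom-killer' line, or None."""
    m = OOM_RE.search(line)
    if m is None:
        return None
    order = int(m.group("order"))
    return {
        "comm": m.group("comm"),
        "gfp_mask": int(m.group("gfp"), 16),
        "flags": m.group("flags").split("|"),
        "order": order,
        "pages": 1 << order,          # 2**order contiguous pages
        "bytes": PAGE_SIZE << order,  # size of the contiguous block
        "oom_score_adj": int(m.group("adj")),
    }


line = (
    "Aug 25 11:46:12 vega kernel: [115461.357412] firefox invoked "
    "oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), "
    "order=2, oom_score_adj=0"
)
info = parse_oom_line(line)
print(info["comm"], info["order"], info["pages"], info["bytes"])
# -> firefox 2 4 16384
```

So the quoted failure was a request for 4 contiguous pages (16 KiB under
the page-size assumption above), not a large allocation.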

--
Markus