Re: [BUG?] OOM with large cache....(x86_64, 2.6.24-rc3-git1, nohz)

From: Nick Piggin
Date: Tue Nov 20 2007 - 03:59:47 EST


On Tuesday 20 November 2007 18:26, Nick Piggin wrote:
> On Tuesday 20 November 2007 16:46, Ingo Molnar wrote:
> > * Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
> > > Unfortunately, we don't show NR_ANON_PAGES in these stats, [...]
> >
> > sidenote: the way i combat these missing pieces of instrumentation in
> > the scheduler is to add them immediately to the cfs-debug-info.sh script
> > (and to /proc/sched_debug if needed). I.e. if we get one report that
> > misses a piece of critical information is OK, but if it's two reports
> > and we still havent made it easy to report the right kind of information
> > that is our fault entirely. This constant ping-ponging for information
> > that goes on for basically every MM problem - which information could
> > have been provided in the first message (by running a single, easy to
> > download tool) is getting pretty hindering i believe.
>
> I do usually to add the stats as I've needed them. I haven't
> specifically needed NR_ANON_PAGES for an oom-killer problem
> before, but I've added plenty of other output there.
>
> (it's in /proc/meminfo of course, which is the most useful...)

BTW. I guess one reason why this stat isn't in the OOM output is
that it probably isn't a kernel bug (but a userspace leak). It's
relatively rare to have a kernel leak in the pagecache or anon
memory so it usually shows up in slab.

Actually I think the oom killer output is a bit too verbose... no
not so much verbose, but it doesn't present the information very
kindly to administrators. IMO, it could be presented better to
non kernel hackers.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/