Re: [RFC PATCH v2 -mm] provide estimated available memory in /proc/meminfo

From: Andrew Morton
Date: Thu Nov 07 2013 - 17:27:51 EST

On Thu, 7 Nov 2013 16:21:32 -0500 Johannes Weiner <hannes@xxxxxxxxxxx> wrote:

> > Subject: provide estimated available memory in /proc/meminfo
> >
> > Many load balancing and workload placing programs check /proc/meminfo
> > to estimate how much free memory is available. They generally do this
> > by adding up "free" and "cached", which was fine ten years ago, but
> > is pretty much guaranteed to be wrong today.
> >
> > It is wrong because Cached includes memory that is not freeable as
> > page cache, for example shared memory segments, tmpfs, and ramfs,
> > and it does not include reclaimable slab memory, which can take up
> > a large fraction of system memory on mostly idle systems with lots
> > of files.
> >
> > Currently, the amount of memory that is available for a new workload,
> > without pushing the system into swap, can be estimated from MemFree,
> > Active(file), Inactive(file), and SReclaimable, as well as the "low"
> > watermarks from /proc/zoneinfo.
> >
> > However, this may change in the future, and user space really should
> > not be expected to know kernel internals to come up with an estimate
> > for the amount of free memory.
> >
> > It is more convenient to provide such an estimate in /proc/meminfo.
> > If things change in the future, we only have to change it in one place.
> >
> > Signed-off-by: Rik van Riel <riel@xxxxxxxxxx>
> > Reported-by: Erik Mouw <erik.mouw_2@xxxxxxx>
>
> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
>
> I have a suspicion that people will end up relying on this number to
> start new workloads in situations where lots of the page cache is
> actually heavily used. We might not swap, but there will still be IO
> from thrashing cache.
>
> Maybe we'll have to subtract mapped cache pages in the future to
> mitigate this risk somehow...
>
> Anyway, we can defer this to when it's proven to be an actual problem.

Well not really. Once we release this thing with a particular
implementation, we are constrained in making any later changes. If we
change it to produce larger numbers, someone's workload will start
swapping. If we change it to produce smaller numbers, someone's
workload will refuse to start.

It all needs a bit of thought, and even some testing! I labelled this
one for-3.14.
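
For reference, the user-space estimate that the quoted changelog describes
(MemFree, Active(file), Inactive(file) and SReclaimable, less the "low"
watermarks from /proc/zoneinfo) might be sketched roughly as below. This is
only an illustration and not the code from the patch: the plain sum, the
function names, and the 4 kB default page size are all assumptions, and the
kernel-side calculation may well weigh these terms differently.

```python
import re


def parse_meminfo(text):
    """Return the numeric fields of /proc/meminfo (most are in kB)."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"^([^:]+):\s+(\d+)", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def sum_low_watermarks_pages(text):
    """Sum the per-zone "low" watermarks (in pages) from /proc/zoneinfo."""
    matches = re.finditer(r"^[ \t]+low[ \t]+(\d+)[ \t]*$", text, re.MULTILINE)
    return sum(int(m.group(1)) for m in matches)


def estimate_available_kb(meminfo_text, zoneinfo_text, page_kb=4):
    """Naive estimate: free + file LRU + reclaimable slab - low watermarks.

    Assumption: the terms are simply added up. page_kb is the page size in
    kB and defaults to 4, which is not true on every architecture.
    """
    mi = parse_meminfo(meminfo_text)
    low_kb = sum_low_watermarks_pages(zoneinfo_text) * page_kb
    total = (mi["MemFree"] + mi["Active(file)"] + mi["Inactive(file)"]
             + mi["SReclaimable"] - low_kb)
    return max(total, 0)
```

On a Linux system the two arguments would be the contents of /proc/meminfo
and /proc/zoneinfo. Having to do this parsing in every tool, and having to
know which fields to add up, is the burden the proposed meminfo field is
meant to remove.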