Re: [RFC PATCH 0/10] split anon and file LRUs

From: Rik van Riel
Date: Tue Nov 06 2007 - 21:23:19 EST


On Tue, 6 Nov 2007 18:11:39 -0800 (PST)
Christoph Lameter <clameter@xxxxxxx> wrote:

> On Sat, 3 Nov 2007, Rik van Riel wrote:
>
> > The current version only has the infrastructure. Large changes to
> > the page replacement policy will follow later.
>
> Hmmmm.. I'd rather see where we are going.

http://linux-mm.org/PageReplacementDesign

> One other way of addressing many of these issues is to allow large page sizes
> on the LRU which will reduce the number of entities that have to be managed.

Linus seems to have vetoed that (unless I am mistaken), so the
chances of that happening soon are probably not very large.

Also, a factor of 16 increase in page size is not going to help
if memory sizes also increase by a factor of 16, since we already
have trouble with today's memory sizes.
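
As a rough illustration of that arithmetic: the number of entities on
the LRU is memory size divided by page size, so scaling both by 16
leaves the count unchanged. The sizes below are assumptions chosen for
the example, not figures from this thread.

```python
# Illustrative arithmetic only; the memory and page sizes are assumed
# values, not measurements from any real system.
KIB = 1024
GIB = 1024 ** 3


def lru_entries(mem_bytes, page_bytes):
    """Number of page-sized entities the VM has to track on the LRU."""
    return mem_bytes // page_bytes


today = lru_entries(128 * GIB, 4 * KIB)            # 4 KiB pages
bigger_pages = lru_entries(128 * GIB, 64 * KIB)    # 16x larger pages
future = lru_entries(16 * 128 * GIB, 64 * KIB)     # 16x pages, 16x memory

print(today, bigger_pages, future)
```

Larger pages cut the entry count by 16 on a fixed amount of memory,
but the count returns to today's value once memory grows by the same
factor.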

> Both approaches actually would work in tandem.

Hence, this patch series.

> > TODO:
> > - have any mlocked and ramfs pages live off of the LRU list,
> >   so we do not need to scan these pages
>
> I think that is the most urgent issue at hand. At least for us.

For some workloads this is indeed the most urgent change.
Since the patches for this already exist, integrating them
is at the top of my list. Expect them to be part of the
split VM patch series by the end of this week.

> > - switch to SEQ replacement for the anon LRU lists, so the
> >   worst case number of pages to scan is reduced greatly.
>
> No idea what that is?

See http://linux-mm.org/PageReplacementDesign

> > - figure out if the file LRU lists need page replacement
> >   changes to help with worst case scenarios
>
> We do not have an accepted standard load. So how would we figure that one
> out?

The current worst case is where we need to scan all of memory,
just to find a few pages we can swap out. With the effects of
lock contention figured in, this can take hours on huge systems.

In order to make the VM more scalable, we need to find acceptable
pages to swap out with low complexity in the VM. The "worst case"
above refers to the upper bound on how much work the VM needs to
do in order to get something evicted from the page cache or swapped
out.
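
The worst case described above can be sketched with a toy model. This
is not kernel code: the page kinds ("pinned", "file"), the list layout,
and the function name are all assumptions invented for the
illustration. It only counts how many pages a scan has to look at
before it finds enough evictable ones.

```python
# Toy model of an LRU scan, not kernel code. Page kinds and list
# contents are hypothetical and chosen only to show the scan cost.

def scan_for_evictable(lru, wanted, is_evictable):
    """Walk the list from the cold end; return (found, pages_scanned)."""
    found = 0
    scanned = 0
    for page in lru:
        if found == wanted:
            break
        scanned += 1
        if is_evictable(page):
            found += 1
    return found, scanned


def evictable(page):
    # Assumption: only "file" pages can be evicted in this model.
    return page == "file"


# One mixed list: many pages that cannot be evicted right now sit in
# front of the few that can.
mixed = ["pinned"] * 100_000 + ["file"] * 32
mixed_found, mixed_scanned = scan_for_evictable(mixed, 32, evictable)

# Separate list holding only evictable pages: the scan touches just
# the pages it needs.
file_only = ["file"] * 64
split_found, split_scanned = scan_for_evictable(file_only, 32, evictable)

print(mixed_found, mixed_scanned)
print(split_found, split_scanned)
```

In this model the mixed list forces the scan to visit every
unevictable page before it reaches the candidates, while the separate
list bounds the work by the number of pages wanted. Lock contention,
which makes the real case much worse, is not modeled here.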

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan