Re: ~500 megs cached yet 2.6.5 goes into swap hell

From: Nikita Danilov
Date: Fri Apr 30 2004 - 08:19:45 EST


Rik van Riel writes:
> On Fri, 30 Apr 2004, Nick Piggin wrote:
> > Rik van Riel wrote:
>
> > > The basic idea of use-once isn't bad (search for LIRS and
> > > ARC page replacement), however the Linux implementation
> > > doesn't have any of the checks and balances that the
> > > researched replacement algorithms have...
>
> > No, use once logic is good in theory I think. Unfortunately
> > our implementation is quite fragile IMO (although it seems
> > to have been "good enough").
>
> Hey, that's what I said ;))))
>
> > This is what I'm currently doing (on top of a couple of other
> > patches, but you get the idea). I should be able to transform
> > it into a proper use-once logic if I pick up Nikita's inactive
> > list second chance bit.
>
> Ummm nope, there just isn't enough info to keep things
> as balanced as ARC/LIRS/CAR(T) can do. No good way to
> auto-tune the sizes of the active and inactive lists.

While keeping "history" for non-resident pages is very valuable in the
long term and from many points of view (it provides the infrastructure
for local replacement and working-set tuning, for example), the current
scanner can still be improved somewhat in the meantime.

Here are results I obtained some time ago. The test is to concurrently
clone (bk) and build (make -jN) the kernel source in M directories.

For N = M = 11, TIMEFORMAT='%3R %3S %3U'

                                           REAL      SYS     USER
"stock"                                3818.320  568.999 4358.460
transfer-dirty-on-refill               3368.690  569.066 4377.845
check-PageSwapCache-after-add-to-swap  3237.632  576.208 4381.248
dont-unmap-on-pageout                  3207.522  566.539 4374.504
async-writepage                        3115.338  562.702 4325.212

(check-PageSwapCache-after-add-to-swap has been merged into mainline since then.)
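For reference, the three columns correspond to the TIMEFORMAT fields: %3R is wall-clock time, %3S system CPU time, %3U user CPU time. A hypothetical C harness measuring the same three quantities for a child command (not the actual test script) might look like:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "cmd" via /bin/sh and report REAL/SYS/USER seconds for the child. */
static void timed_run(const char *cmd, double *real, double *sys, double *user)
{
	struct timeval t0, t1;
	struct rusage ru;
	pid_t pid;

	gettimeofday(&t0, NULL);
	pid = fork();
	if (pid == 0) {
		execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
		_exit(127);
	}
	/* wait4() fills in the child's resource usage (SYS/USER times). */
	wait4(pid, NULL, 0, &ru);
	gettimeofday(&t1, NULL);

	*real = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
	*sys  = ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
	*user = ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6;
}
```

Note that with M concurrent -jN builds, USER can exceed REAL, as it does in the table above.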

These patches haven't been updated for some time. The last version is at
ftp://ftp.namesys.com/pub/misc-patches/unsupported/extra/2004.03.25-2.6.5-rc2

[from Nick Piggin's patch]
>
> Changes mark_page_accessed to only set the PageAccessed bit, and
> not move pages around the LRUs. This means we don't have to take
> the lru_lock, and it also makes page ageing and scanning consistent
> and all handled in mm/vmscan.c

By the way, the batch-mark_page_accessed patch at the URL above also
tries to reduce lock contention in mark_page_accessed(), but through the
more standard approach of batching target pages in a per-cpu pagevec.
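A minimal userland sketch of that batching idea (all names made up for the illustration, not the actual patch): pages are queued into a small per-cpu vector without touching any shared lock, and the lock is taken once per full vector instead of once per page.

```c
#define PVEC_SIZE 14	/* the kernel's pagevec holds 14 entries */

struct pagevec {
	unsigned int nr;
	unsigned long pages[PVEC_SIZE];
};

static struct pagevec pvec;	/* one per CPU in the real patch */
static unsigned long drained;	/* pages processed under the "LRU lock" */

/*
 * Flush the batch: in the kernel this is where the LRU lock is taken
 * once for the whole vector and every page in it is aged/moved.
 */
static void pvec_drain(struct pagevec *pv)
{
	drained += pv->nr;
	pv->nr = 0;
}

/* Queue a page; only a full vector pays the shared-lock cost. */
static void batched_mark_accessed(unsigned long page)
{
	pvec.pages[pvec.nr++] = page;
	if (pvec.nr == PVEC_SIZE)
		pvec_drain(&pvec);
}
```

The effect is that lock acquisitions drop by roughly a factor of PVEC_SIZE, at the cost of slightly delayed ageing for pages still sitting in a partially filled vector.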

Nikita.
