[PATCH 0/3] Fadvise: Directory level page cache cleaning support

From: Li Wang
Date: Mon Dec 30 2013 - 08:47:04 EST

VFS relies on an LRU-like page cache eviction algorithm
to reclaim cache space. Such a general and simple algorithm
is good in that it is application independent, and it works
for normal situations. However, it sometimes does not help much
for applications that are performance sensitive or under
heavy load, since LRU may incorrectly evict pages that are about
to be referenced, resulting in severe performance degradation due to
cache thrashing. Applications have the most knowledge
about what they are doing, and they can always do better if
they are given the chance. This motivates giving applications
more ability to manipulate the page cache.

Currently, Linux supports file system wide cache cleaning by virtue of
the proc interface 'drop_caches', but it is very coarse grained and
was originally proposed for debugging. The other option is file-level
page cache cleaning through 'fadvise'. However, this is sometimes not
flexible enough and not easy to use, especially for directory wide
operations or with massive numbers of small files.
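
To illustrate the per-file approach, here is a minimal sketch of a
directory wide drop done from user space with the existing interface. It
only assumes standard POSIX calls (openat, fdopendir, fstatat,
posix_fadvise). The helper names drop_cache_under and drop_tree_cache are
hypothetical and are not part of this patch set.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: walk the directory open at dirfd and advise the
 * kernel to drop cached pages of every regular file below it.
 * Takes ownership of dirfd. Returns the number of files advised,
 * or -1 if the directory stream cannot be opened. */
static long drop_cache_under(int dirfd)
{
    DIR *dir = fdopendir(dirfd);
    if (dir == NULL) {
        close(dirfd);
        return -1;
    }

    long advised = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;

        struct stat st;
        if (fstatat(dirfd, ent->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0)
            continue; /* entry vanished or is not accessible */

        if (S_ISDIR(st.st_mode)) {
            int sub = openat(dirfd, ent->d_name,
                             O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
            if (sub >= 0) {
                long n = drop_cache_under(sub); /* closes sub */
                if (n > 0)
                    advised += n;
            }
        } else if (S_ISREG(st.st_mode)) {
            /* One open + fadvise + close per file. */
            int fd = openat(dirfd, ent->d_name, O_RDONLY | O_NOFOLLOW);
            if (fd >= 0) {
                if (posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED) == 0)
                    advised++;
                close(fd);
            }
        }
    }
    closedir(dir); /* also closes dirfd */
    return advised;
}

/* Hypothetical entry point: returns files advised, or -1 on failure. */
long drop_tree_cache(const char *root)
{
    int fd = open(root, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;
    return drop_cache_under(fd);
}
```

A caller would use drop_tree_cache("2") for the whole tree. Every file
costs an open, an fadvise and a close, which is the overhead that becomes
noticeable with many small files.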

This patch extends 'fadvise' to support directory level page cache
cleaning. A call to posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED)
with 'fd' referring to a directory will recursively reclaim the page
cache entries of the files inside 'fd'. For security reasons, inodes
for which the caller does not hold appropriate permissions will not
be manipulated.

It is easy to demonstrate the advantages of directory level page
cache cleaning. We use a machine with a Pentium(R) Dual-Core CPU
E5800 @ 3.20GHz, and with 2GB memory. Two directories named '1'
and '3' are created, with each containing X (360 - 460) files,
and each file with a size of 2MB. The test scripts are as follows,

The test scripts (without cache cleaning)
cp -r 1 2
cp -r 3 4
time grep "data" 1/*

The time on 'grep "data" 1/*' is measured
with/without cache cleaning, under different file counts.
With cache cleaning, we clean all cache entries of files
in '2' before doing 'cp -r 3 4' by using pretty much
the following two statements,
fd = open("2", O_DIRECTORY, 0644);
posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
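
For reference, a slightly fuller sketch of the two statements above with
error handling. The wrapper name drop_dir_cache is hypothetical. The
recursive behaviour is the one proposed by this series, so on a kernel
without these patches the call is not expected to recurse into the files
of the directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical wrapper around the directory level call.
 * Returns 0 on success, -1 if the directory cannot be opened,
 * otherwise the error number returned by posix_fadvise()
 * (posix_fadvise returns the error directly, it does not set errno). */
int drop_dir_cache(const char *path)
{
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    /* offset 0 and len 0 mean "the whole file"; with this series applied,
     * a directory fd is assumed to cover all files beneath it. */
    int err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

    close(fd);
    return err;
}
```

In the test above this would be invoked as drop_dir_cache("2") between
the two cp commands.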

The results are as follows (in seconds),
X: Number of files inside each directory

  X    Without Cleaning    With Cleaning
360         2.385              1.361
380         3.159              1.466
400         3.972              1.558
420         4.823              1.548
440         5.798              1.702
460         6.888              2.197
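
To make the trend easier to read, the ratio between the two columns can
be computed directly from the numbers above. The sketch below is only
arithmetic on the reported timings. The struct and function names are
made up for illustration. The ratio grows from roughly 1.75x at 360 files
to roughly 3.1x at 460 files.

```c
#include <stddef.h>

/* One row of the results table above (times in seconds). */
struct sample {
    int files;               /* X: files per directory */
    double without_cleaning; /* grep time without cache cleaning */
    double with_cleaning;    /* grep time with cache cleaning */
};

/* Timings copied verbatim from the table. */
const struct sample results[] = {
    {360, 2.385, 1.361},
    {380, 3.159, 1.466},
    {400, 3.972, 1.558},
    {420, 4.823, 1.548},
    {440, 5.798, 1.702},
    {460, 6.888, 2.197},
};

const size_t results_count = sizeof(results) / sizeof(results[0]);

/* Speedup factor of the grep run when '2' is cleaned beforehand. */
double speedup(const struct sample *s)
{
    return s->without_cleaning / s->with_cleaning;
}
```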

The page cache is not large enough to hold all four
directories (even with X = 360 they total 4 x 360 x 2MB = 2880MB,
more than the 2GB of memory), so 'cp -r 3 4' causes some
entries of '1' to be evicted (due to LRU). When re-accessing '1',
some entries need to be reloaded from disk, which is time-consuming.
In this case, cleaning '2' before 'cp -r 3 4' enjoys a good

Li Wang (3):
  VFS: Add the declaration of shrink_pagecache_parent
  Add shrink_pagecache_parent
  Fadvise: Add the ability for directory level page cache cleaning

 fs/dcache.c            | 36 ++++++++++++++++++++++++++++++++++++
 include/linux/dcache.h |  1 +
 mm/fadvise.c           |  4 ++++
 3 files changed, 41 insertions(+)

