[PATCH 0/6] [RFC] Fallocate Volatile Ranges v4

From: John Stultz
Date: Tue Jun 12 2012 - 21:13:37 EST


So after sending out the v3 iteration of this patch set, I noticed
that the approach taken there of using writepage to trigger the
volatile range purging would not be functional, as writepage isn't
called on systems without swap.

Given this, for this iteration I've reverted the patch set back to
using a shrinker. I've also included a patch to convert ashmem to
use volatile ranges, which almost cuts the size of the ashmem
driver in half.

Now, I've also gotten feedback from Kosaki-san that the shrinker
interface, along with tracking volatile ranges via an LRU, is a poor
choice, as it is not NUMA-aware. I've spent some time trying to
learn more about the VM internals, and I've included two HACK
patches that propose a way to address these concerns.

The basic idea is to remove the volatile range LRU management and
instead leverage the VM's page LRU management, by deactivating
the entire range when marking it volatile. Then in the writepage
code we check if a page is part of a volatile range, and if so
we purge it. This provides LRU-like behavior for purging volatile
ranges (so we purge old volatile ranges before new ones), while
allowing the kernel to free memory first on a NUMA node that is
under pressure.
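
To make the writepage check above concrete, here is a minimal
user-space sketch of the idea. It is not code from the patches:
struct vrange, struct toy_file, vrange_lookup() and toy_writepage()
are all hypothetical names, the range lookup is a linear scan rather
than the interval tree the series adds, and it assumes that hitting
any page of a volatile range purges the whole range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 16

/* Hypothetical stand-in for a volatile range of page indices [start, end]. */
struct vrange {
	size_t start, end;
	bool purged;
};

/* Toy model of a tmpfs file: which pages hold data, plus its ranges. */
struct toy_file {
	bool resident[NPAGES];
	struct vrange *ranges;
	size_t nranges;
};

/* Return the volatile range covering page idx, or NULL if none does. */
static struct vrange *vrange_lookup(struct toy_file *f, size_t idx)
{
	for (size_t i = 0; i < f->nranges; i++)
		if (idx >= f->ranges[i].start && idx <= f->ranges[i].end)
			return &f->ranges[i];
	return NULL;
}

/*
 * Model of the writepage hook: if the page being written back lies in
 * an unpurged volatile range, drop every page of that range (assumed
 * semantics) instead of swapping. Returns the number of pages freed.
 */
static size_t toy_writepage(struct toy_file *f, size_t idx)
{
	struct vrange *r;
	size_t freed = 0;

	assert(idx < NPAGES);
	r = vrange_lookup(f, idx);
	if (!r || r->purged)
		return 0;	/* not volatile: real code would go on to swap-out */

	for (size_t i = r->start; i <= r->end && i < NPAGES; i++) {
		if (f->resident[i]) {
			f->resident[i] = false;
			freed++;
		}
	}
	r->purged = true;
	return freed;
}
```

Because reclaim reaches the oldest inactive pages first, the first
volatile range to be hit by toy_writepage() in this model is the one
that was deactivated longest ago, which is where the LRU-like
ordering comes from.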

Of course, the issue I had with the v3 patchset still applies:
we never call shmem_writepage on a system without swap! So to
address this, I've created a quite hackish patch that tries to
manage the anonymous active and inactive LRUs slightly differently.

The idea is that on systems without swap, there's not much use
in keeping separate inactive and active anonymous LRUs, since we
can't swap out any of the anonymous memory (at least this is my
assumption, and I may be wrong). So when we don't have swap, we keep
all anonymous pages active. Thus, only pages that are deactivated
will be moved to the inactive LRU. Then we change shrink_lruvec()
to try to write out inactive anonymous pages. Of course, writepage
won't write anything out if there's no swap, but it will now
check if the page is volatile and purge it if appropriate.
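
As a rough illustration of the swapless policy just described, here
is a small user-space model. Again, this is only a sketch under my
own assumptions, not the patch itself: struct toy_page, toy_has_swap,
toy_age_page(), toy_mark_volatile(), anon_writepage() and toy_shrink()
are hypothetical names, and freeing a page is modeled by simply taking
it off both lists.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_lru { LRU_ACTIVE_ANON, LRU_INACTIVE_ANON, LRU_NONE /* freed */ };

struct toy_page {
	enum toy_lru lru;
	bool is_volatile;
};

/* Hypothetical stand-in for "this system has swap configured". */
static bool toy_has_swap;

/* Normal aging: only demote active pages when swap exists. */
static void toy_age_page(struct toy_page *p)
{
	if (toy_has_swap && p->lru == LRU_ACTIVE_ANON)
		p->lru = LRU_INACTIVE_ANON;
}

/* Marking a page volatile explicitly deactivates it, swap or not. */
static void toy_mark_volatile(struct toy_page *p)
{
	p->is_volatile = true;
	if (p->lru == LRU_ACTIVE_ANON)
		p->lru = LRU_INACTIVE_ANON;
}

/*
 * Model of writepage: a volatile page is purged; any other page can
 * only be freed by swapping it out. Returns true if the page was freed.
 */
static bool anon_writepage(struct toy_page *p)
{
	if (p->is_volatile || toy_has_swap) {
		p->lru = LRU_NONE;
		return true;
	}
	return false;
}

/* Model of the reclaim pass: try writepage on every inactive anon page. */
static size_t toy_shrink(struct toy_page *pages, size_t n)
{
	size_t freed = 0;

	for (size_t i = 0; i < n; i++)
		if (pages[i].lru == LRU_INACTIVE_ANON &&
		    anon_writepage(&pages[i]))
			freed++;
	return freed;
}
```

With toy_has_swap set to false, aging never populates the inactive
list, so toy_shrink() only ever visits pages that were deactivated
via toy_mark_volatile(), and those are exactly the ones it can purge.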

So... does that sound reasonable or terrible?

Are there other approaches I should be trying?

Maybe the original shrinker approach isn't so bad after all? :)

Many thanks again to Dave Hansen, who's been fending off my
numerous questions on IRC for the last few days.

thanks
-john


CC: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
CC: Android Kernel Team <kernel-team@xxxxxxxxxxx>
CC: Robert Love <rlove@xxxxxxxxxx>
CC: Mel Gorman <mel@xxxxxxxxx>
CC: Hugh Dickins <hughd@xxxxxxxxxx>
CC: Dave Hansen <dave@xxxxxxxxxxxxxxxxxx>
CC: Rik van Riel <riel@xxxxxxxxxx>
CC: Dmitry Adamushko <dmitry.adamushko@xxxxxxxxx>
CC: Dave Chinner <david@xxxxxxxxxxxxx>
CC: Neil Brown <neilb@xxxxxxx>
CC: Andrea Righi <andrea@xxxxxxxxxxxxxxx>
CC: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
CC: Taras Glek <tgek@xxxxxxxxxxx>
CC: Mike Hommey <mh@xxxxxxxxxxxx>
CC: Jan Kara <jack@xxxxxxx>
CC: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxx>


John Stultz (6):
  [RFC] Interval tree implementation
  [RFC] Add volatile range management code
  [RFC] tmpfs: Add FALLOC_FL_MARK_VOLATILE/UNMARK_VOLATILE handlers
  [RFC] ashmem: Convert ashmem to use volatile ranges
  [RFC][HACK] tmpfs: Purge volatile ranges on writepage instead of
    using shrinker
  [RFC][HACK] mm: Change memory management of anonymous pages on
    swapless systems

 drivers/staging/android/ashmem.c |  331 +--------------------------
 fs/open.c                        |    3 +-
 include/linux/falloc.h           |    7 +-
 include/linux/intervaltree.h     |   55 +++++
 include/linux/pagevec.h          |    5 +-
 include/linux/swap.h             |   23 ++-
 include/linux/volatile.h         |   40 ++++
 lib/Makefile                     |    2 +-
 lib/intervaltree.c               |  119 ++++++++++
 mm/Makefile                      |    2 +-
 mm/shmem.c                       |  102 +++++++++
 mm/swap.c                        |   13 +-
 mm/vmscan.c                      |    9 -
 mm/volatile.c                    |  467 ++++++++++++++++++++++++++++++++++++++
 14 files changed, 828 insertions(+), 350 deletions(-)
 create mode 100644 include/linux/intervaltree.h
 create mode 100644 include/linux/volatile.h
 create mode 100644 lib/intervaltree.c
 create mode 100644 mm/volatile.c

--
1.7.3.2.146.gca209
