[PATCH] rmap 14

From: Rik van Riel (riel@conectiva.com.br)
Date: Thu Aug 15 2002 - 21:07:49 EST


This is a fairly minimal change for rmap14 since I've been
working on 2.5 most of the time. The experimental code in
this version is a hopefully smarter page_launder() that
shouldn't do much more IO than needed and hopefully gets
rid of the stalls that people have seen during heavy swap
activity. Please test this version. ;)

The first release of the 14th version of the reverse
mapping based VM is now available.
This is an attempt at making a more robust and flexible VM
subsystem, while cleaning up a lot of code at the same time.
The patch is available from:

           http://surriel.com/patches/2.4/2.4.19-rmap14
and http://surriel.com/patches/2.4/incr/rmap13c-rmap14
and http://linuxvm.bkbits.net/

My big TODO items for a next release are:
  - O(1) page launder - currently functional but slow, needs to be tuned
  - pte-highmem

rmap 14:
  - get rid of stalls during swapping, hopefully (me)
  - low latency zap_page_range (Robert Love)
rmap 13c:
  - add wmb() to wakeup_memwaiters (Arjan van de Ven)
  - remap_pmd_range now calls pte_alloc with full address (Paul Mackerras)
  - #ifdef out pte_chain_lock/unlock on UP machines (Andrew Morton)
  - un-BUG() truncate_complete_page, the race is expected (Andrew Morton, me)
  - remove NUMA changes from rmap13a (Christoph Hellwig)
rmap 13b:
  - prevent PF_MEMALLOC recursion for higher order allocs (Arjan van de Ven, me)
  - fix small SMP race, PG_lru (Hugh Dickins)
rmap 13a:
  - NUMA changes for page_address (Samuel Ortiz)
  - replace vm.freepages with simpler kswapd_minfree (Christoph Hellwig)
rmap 13:
  - rename touch_page to mark_page_accessed and uninline (Christoph Hellwig)
  - NUMA bugfix for __alloc_pages (William Irwin)
  - kill __find_page (Christoph Hellwig)
  - make pte_chain_freelist per zone (William Irwin)
  - protect pte_chains by per-page lock bit (William Irwin)
  - minor code cleanups (me)
rmap 12i:
  - slab cleanup (Christoph Hellwig)
  - remove references to compiler.h from mm/* (me)
  - move rmap to marcelo's bk tree (me)
  - minor cleanups (me)
rmap 12h:
  - hopefully fix OOM detection algorithm (me)
  - drop pte quicklist in anticipation of pte-highmem (me)
  - replace andrea's highmem emulation by ingo's one (me)
  - improve rss limit checking (Nick Piggin)
rmap 12g:
  - port to armv architecture (David Woodhouse)
  - NUMA fix to zone_table initialisation (Samuel Ortiz)
  - remove init_page_count (David Miller)
rmap 12f:
  - for_each_pgdat macro (William Lee Irwin)
  - put back EXPORT(__find_get_page) for modular rd (me)
  - make bdflush and kswapd actually start queued disk IO (me)
rmap 12e
  - RSS limit fix, the limit can be 0 for some reason (me)
  - clean up for_each_zone define to not need pgdata_t (William Lee Irwin)
  - fix i810_dma bug introduced with page->wait removal (William Lee Irwin)
rmap 12d:
  - fix compiler warning in rmap.c (Roger Larsson)
  - read latency improvement (read-latency2) (Andrew Morton)
rmap 12c:
  - fix small balancing bug in page_launder_zone (Nick Piggin)
  - wakeup_kswapd / wakeup_memwaiters code fix (Arjan van de Ven)
  - improve RSS limit enforcement (me)
rmap 12b:
  - highmem emulation (for debugging purposes) (Andrea Arcangeli)
  - ulimit RSS enforcement when memory gets tight (me)
  - sparc64 page->virtual quickfix (Greg Procunier)
rmap 12a:
  - fix the compile warning in buffer.c (me)
  - fix divide-by-zero on highmem initialisation DOH! (me)
  - remove the pgd quicklist (suspicious ...) (DaveM, me)
rmap 12:
  - keep some extra free memory on large machines (Arjan van de Ven, me)
  - higher-order allocation bugfix (Adrian Drzewiecki)
  - nr_free_buffer_pages() returns inactive + free mem (me)
  - pages from unused objects directly to inactive_clean (me)
  - use fast pte quicklists on non-pae machines (Andrea Arcangeli)
  - remove sleep_on from wakeup_kswapd (Arjan van de Ven)
  - page waitqueue cleanup (Christoph Hellwig)
rmap 11c:
  - oom_kill race locking fix (Andres Salomon)
  - elevator improvement (Andrew Morton)
  - dirty buffer writeout speedup (hopefully ;)) (me)
  - small documentation updates (me)
  - page_launder() never does synchronous IO, kswapd
    and the processes calling it sleep on higher level (me)
  - deadlock fix in touch_page() (me)
rmap 11b:
  - added low latency reschedule points in vmscan.c (me)
  - make i810_dma.c include mm_inline.h too (William Lee Irwin)
  - wake up kswapd sleeper tasks on OOM kill so the
    killed task can continue on its way out (me)
  - tune page allocation sleep point a little (me)
rmap 11a:
  - don't let refill_inactive() progress count for OOM (me)
  - after an OOM kill, wait 5 seconds for the next kill (me)
  - agpgart_be fix for hashed waitqueues (William Lee Irwin)
rmap 11:
  - fix stupid logic inversion bug in wakeup_kswapd() (Andrew Morton)
  - fix it again in the morning (me)
  - add #ifdef BROKEN_PPC_PTE_ALLOC_ONE to rmap.h, it
    seems PPC calls pte_alloc() before mem_map[] init (me)
  - disable the debugging code in rmap.c ... the code
    is working and people are running benchmarks (me)
  - let the slab cache shrink functions return a value
    to help prevent early OOM killing (Ed Tomlinson)
  - also, don't call the OOM code if we have enough
    free pages (me)
  - move the call to lru_cache_del into __free_pages_ok (Ben LaHaise)
  - replace the per-page waitqueue with a hashed
    waitqueue, reduces size of struct page from 64
    bytes to 52 bytes (48 bytes on non-highmem machines) (William Lee Irwin)
rmap 10:
  - fix the livelock for real (yeah right), turned out
    to be a stupid bug in page_launder_zone() (me)
  - to make sure the VM subsystem doesn't monopolise
    the CPU, let kswapd and some apps sleep a bit under
    heavy stress situations (me)
  - let __GFP_HIGH allocations dig a little bit deeper
    into the free page pool, the SCSI layer seems fragile (me)
rmap 9:
  - improve comments all over the place (Michael Cohen)
  - don't panic if page_remove_rmap() cannot find the
    rmap in question, it's possible that the memory was
    PG_reserved and belonging to a driver, but the driver
    exited and cleared the PG_reserved bit (me)
  - fix the VM livelock by replacing > by >= in a few
    critical places in the pageout code (me)
  - treat the reclaiming of an inactive_clean page like
    allocating a new page, calling try_to_free_pages()
    and/or fixup_freespace() if required (me)
  - when low on memory, don't make things worse by
    doing swapin_readahead (me)
rmap 8:
  - add ANY_ZONE to the balancing functions to improve
    kswapd's balancing a bit (me)
  - regularize some of the maximum loop bounds in
    vmscan.c for cosmetic purposes (William Lee Irwin)
  - move page_address() to architecture-independent
    code, now the removal of page->virtual is portable (William Lee Irwin)
  - speed up free_area_init_core() by doing a single
    pass over the pages and not using atomic ops (William Lee Irwin)
  - documented the buddy allocator in page_alloc.c (William Lee Irwin)
rmap 7:
  - clean up and document vmscan.c (me)
  - reduce size of page struct, part one (William Lee Irwin)
  - add rmap.h for other archs (untested, not for ARM) (me)
rmap 6:
  - make the active and inactive_dirty list per zone,
    this is finally possible because we can free pages
    based on their physical address (William Lee Irwin)
  - cleaned up William's code a bit (me)
  - turn some defines into inlines and move those to
    mm_inline.h (the includes are a mess ...) (me)
  - improve the VM balancing a bit (me)
  - add back inactive_target to /proc/meminfo (me)
rmap 5:
  - fixed recursive buglet, introduced by directly
    editing the patch for making rmap 4 ;))) (me)
rmap 4:
  - look at the referenced bits in page tables (me)
rmap 3:
  - forgot one FASTCALL definition (me)
rmap 2:
  - teach try_to_unmap_one() about mremap() (me)
  - don't assign swap space to pages with buffers (me)
  - make the rmap.c functions FASTCALL / inline (me)
rmap 1:
  - fix the swap leak in rmap 0 (Dave McCracken)
rmap 0:
  - port of reverse mapping VM to 2.4.16 (me)

Rik

-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/ http://distro.conectiva.com/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Aug 15 2002 - 22:00:41 EST