[PATCH] memory limit bugfix & documentation update

Rik van Riel (H.H.vanRiel@phys.uu.nl)
Tue, 21 Apr 1998 12:41:55 +0200 (MET DST)


Hi Linus,

Here's a small patch that does the following things:
- Change the swapout policy from RCL_PERSIST to RCL_ROUNDROBIN
  so that the [buffermem,pagecache].borrow_percent limits
  actually work the way they were intended.
- Update the inline & external documentation on the code &
  /proc/sys/vm/freepages (which is unused now, but likely
  to be re(ab)used in the near future).
- Fix a typo in mm/vmscan.c so that pagecache.max_percent
  works again.
- Increase kswapd's bandwidth to what it was before I
  introduced the variable aggressiveness (and somewhat more
  when we're low on memory). The kswapd performance problems
  should be gone, or at least alleviated, now.

The patch applies cleanly against both 2.1.96 and 2.1.97
(no mm/ files changed between the two).

Since it's basically a bugfix & documentation update, it
should be applied even under the current feature-freeze.

Rik.
+-------------------------------------------+--------------------------+
| Linux: - LinuxHQ MM-patches page | Scouting webmaster |
| - kswapd ask-him & complain-to guy | Vries cubscout leader |
| http://www.phys.uu.nl/~riel/ | <H.H.vanRiel@phys.uu.nl> |
+-------------------------------------------+--------------------------+

--- linux/mm/vmscan.c.pol Tue Apr 21 12:14:40 1998
+++ linux/mm/vmscan.c Tue Apr 21 12:25:34 1998
@@ -458,17 +458,17 @@
switch (state) {
do {
case 0:
+ state = 1;
if (shrink_mmap(i, gfp_mask))
return 1;
- state = 1;
case 1:
+ state = 2;
if ((gfp_mask & __GFP_IO) && shm_swap(i, gfp_mask))
return 1;
- state = 2;
default:
+ state = 0;
if (swap_out(i, gfp_mask))
return 1;
- state = 0;
i--;
} while ((i - stop) >= 0);
}
@@ -553,23 +553,17 @@
* more aggressive if we're really
* low on free memory.
*
- * Normally this is called 4 times
- * a second if we need more memory,
- * so this has a normal rate of
- * X*4 pages of memory free'd per
- * second. That rate goes up when
- *
- * - we're really low on memory (we get woken
- * up a lot more)
- * - other processes fail to allocate memory,
- * at which time they try to do their own
- * freeing.
- *
- * A "tries" value of 50 means up to 200 pages
- * per second (1.6MB/s). This should be a /proc
- * thing.
+ * The number of tries is 512 divided by an
+ * 'urgency factor'. In practice this will mean
+ * a value of 512 / 8 = 64 pages at a time,
+ * giving 64 * 4 (times/sec) * 4k (pagesize) =
+ * 1 MB/s in lowest-priority background
+ * paging. This number rises to 8 MB/s when the
+ * priority is highest (but then we'll be woken
+ * up more often and the rate will be even higher).
+ * -- Should make this sysctl tunable...
*/
- tries = (50 << 2) >> free_memory_available(3);
+ tries = (512) >> free_memory_available(3);

while (tries--) {
int gfp_mask;
@@ -622,7 +616,7 @@

if ((long) (now - want) >= 0) {
if (want_wakeup || (num_physpages * buffer_mem.max_percent) < (buffermem >> PAGE_SHIFT) * 100
- || (num_physpages * page_cache.max_percent < page_cache_size)) {
+ || (num_physpages * page_cache.max_percent < page_cache_size * 100)) {
/* Set the next wake-up time */
next_swap_jiffies = now + swapout_interval;
wake_up(&kswapd_wait);
--- linux/mm/page_alloc.c.pol Tue Apr 21 13:17:26 1998
+++ linux/mm/page_alloc.c Tue Apr 21 12:33:34 1998
@@ -108,17 +108,6 @@
* but this had better return false if any reasonable "get_free_page()"
* allocation could currently fail..
*
- * Currently we approve of the following situations:
- * - the highest memory order has two entries
- * - the highest memory order has one free entry and:
- * - the next-highest memory order has two free entries
- * - the highest memory order has one free entry and:
- * - the next-highest memory order has one free entry
- * - the next-next-highest memory order has two free entries
- *
- * [previously, there had to be two entries of the highest memory
- * order, but this lead to problems on large-memory machines.]
- *
* This will return zero if no list was found, non-zero
* if there was memory (the bigger, the better).
*/
@@ -129,13 +118,14 @@
struct free_area_struct * list;

/*
- * If we have more than about 6% of all memory free,
+ * If we have more than about 3% to 5% of all memory free,
* consider it to be good enough for anything.
* It may not be, due to fragmentation, but we
* don't want to keep on forever trying to find
* free unfragmented memory.
+ * Added low/high water marks to avoid thrashing -- Rik.
*/
- if (nr_free_pages > num_physpages >> 4)
+ if (nr_free_pages > (num_physpages >> 5) + (nr ? 0 : num_physpages >> 6))
return nr+1;

list = free_area + NR_MEM_LISTS;
--- linux/Documentation/sysctl/vm.txt.pol Tue Apr 21 12:28:33 1998
+++ linux/Documentation/sysctl/vm.txt Tue Apr 21 12:31:00 1998
@@ -94,7 +94,8 @@

The three values in this file correspond to the values in
the struct buffer_mem. It controls how much memory should
-be used for buffer memory.
+be used for buffer memory. The percentage is calculated
+as a percentage of total system memory.

The values are:
min_percent -- this is the minumum percentage of memory
@@ -111,29 +112,9 @@
This file contains the values in the struct freepages. That
struct contains three members: min, low and high.

-These numbers are used by the VM subsystem to keep a reasonable
-number of pages on the free page list, so that programs can
-allocate new pages without having to wait for the system to
-free used pages first. The actual freeing of pages is done
-by kswapd, a kernel daemon.
-
-min -- when the number of free pages reaches this
- level, only the kernel can allocate memory
- for _critical_ tasks only
-low -- when the number of free pages drops below
- this level, kswapd is woken up immediately
-high -- this is kswapd's target, when more than <high>
- pages are free, kswapd will stop swapping.
-
-When the number of free pages is between low and high,
-and kswapd hasn't run for swapout_interval jiffies, then
-kswapd is woken up too. See swapout_interval for more info.
-
-When free memory is always low on your system, and kswapd has
-trouble keeping up with allocations, you might want to
-increase these values, especially high and perhaps low.
-I've found that a 1:2:4 relation for these values tend to work
-rather well in a heavily loaded system.
+These variables are currently unused (?), but they're
+very likely to be abused for something else in the near
+future, so don't yet remove it from the source...

==============================================================
