Dear Rick,
I'm writing to you with some ideas for Linux memory management.
There are two things which I think would be really nice.
1. Imagine you're running a Linux binary off a umsdos or nfs or
smbfs (read *slow*) volume. When memory gets tight, the kernel
starts looking for pages to throw away to meet requests. For
ordinary binaries, the first things it throws away are the read-only
program text pages, because they never need to be written out
- they can just be re-read. But if you're running off a slow
volume, the cost of rereading the page is a lot higher - often
higher in fact than if that read-only page had been swapped to a
(raw, local) swap partition and re-read quickly from there.
So the goal is, when the executable comes off a slow volume,
to have the kernel treat discarding the page as more expensive
than swapping it, so that pages get swapped locally instead of
being discarded and then later re-read. Maybe a mount option could
be used to express how costly (in terms of slowness) the volume is.
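To make the comparison in point 1 concrete, here is a rough sketch in
Python. It is only an illustration, not kernel code: the function name
eviction_choice and all the per-page costs are made up, and the costs
stand in for whatever a mount option might supply.

```python
# Hypothetical cost model for point 1. Nothing here is a real kernel
# interface; the millisecond figures are assumptions, not measurements.

def eviction_choice(reread_ms, swap_write_ms, swap_read_ms):
    """Decide what to do with a clean, read-only text page.

    Discarding is free now, but costs reread_ms when the page is
    next needed. Swapping costs a write to swap now plus a read
    from swap later.
    """
    discard_cost = reread_ms
    swap_cost = swap_write_ms + swap_read_ms
    return "swap" if swap_cost < discard_cost else "discard"


# Binary on a local disk: re-reading is about as cheap as swap.
print(eviction_choice(reread_ms=10, swap_write_ms=10, swap_read_ms=10))

# Binary on a slow nfs/smbfs volume: re-reading is far more expensive.
print(eviction_choice(reread_ms=200, swap_write_ms=10, swap_read_ms=10))
```

With these assumed numbers the local case discards the page and the
slow-volume case swaps it, which is the behaviour I'm asking for.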
2. I know free RAM is bad, but....
Linux's swapping performance has been steadily increasing over
the years, but there's still something I'd like to see that I
think would help a lot. You can perform this experiment to verify
what I'm saying.
Let's say I'm on a machine with 16 MB of RAM. Then I add some
swap space. If I never use up all of that 16 MB of RAM, the
swap never gets touched. All memory not being used by programs
is free to be used as buffer cache.
Now let's say I've used all except the last 512 KB. When I do some
file reading, that memory gets used to cache my file. But if my
fileset is larger than the buffer cache, the advantage of the
cache is lost, because the data won't fit within the cache.
Now imagine you run a new program which takes 10 MB of memory.
This of course will cause swapping. But think of the situation
after that new program has finished: all the least-used program
pages have left real RAM, leaving maybe 6-7 MB of real RAM available
for use as buffer cache. The fileset stands a better chance of
fitting, and my future performance is improved.
Why should a file page I read in ten seconds ago need to be reread,
just because a 10-minute-old program page took up RAM that could
have been used as file cache?
AFAIK at the moment the only time swapping happens is if there
isn't enough real RAM to satisfy a request - even if the pages
of program text are older than the file being cached. What would
be nice is if swapping were considered whenever a file is read,
so that little-used program pages get swapped out to provide a
larger file cache area. This could even be done with a daemon:
when the machine is idle, any program page that has not been
used in the last minute gets either dumped or swapped.
If I then run top while the machine is idle, the amount of free RAM
should increase over time as unused pages are swapped out to a
disk which is otherwise not being used.
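Here is a rough Python sketch of what one pass of such a daemon might
decide. It is only an illustration: the Page record, the reclaim_idle
function and the 60-second threshold are all made up, and I'm assuming
clean file-backed pages are dumped while everything else is swapped
(ignoring the slow-volume case from point 1).

```python
from dataclasses import dataclass

# Hypothetical model of the idle-time daemon; none of these names
# correspond to real kernel structures.


@dataclass
class Page:
    name: str
    last_used: float         # time of last access, in seconds
    clean_file_backed: bool  # True if it can simply be re-read from a file


def reclaim_idle(pages, now, max_idle=60.0):
    """Return an action for each page: 'keep', 'dump' or 'swap'."""
    actions = {}
    for page in pages:
        if now - page.last_used <= max_idle:
            actions[page.name] = "keep"   # touched recently
        elif page.clean_file_backed:
            actions[page.name] = "dump"   # can be re-read later
        else:
            actions[page.name] = "swap"   # must be written to swap first
    return actions


pages = [
    Page("editor text", last_used=0.0, clean_file_backed=True),
    Page("editor heap", last_used=0.0, clean_file_backed=False),
    Page("shell heap", last_used=590.0, clean_file_backed=False),
]

# Ten minutes in: the editor pages are stale, the shell page is not.
print(reclaim_idle(pages, now=600.0))
```

Every page marked dump or swap is RAM that becomes available for file
cache before anything has actually run short.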
So the idea is, consider swapping or dumping stuff out _well_ before
real RAM is exhausted. The performance gain from not having to
re-read a file page is likely to be higher than the loss from
having to swap out a page.
In practice, when I'm low on real RAM, I start Emacs then quit it.
This does what I described above, and the performance benefit
afterwards is really worth it.
Just my two cents' worth. I would appreciate hearing others' ideas
about this.
Mitch.