> Date: Mon, 22 Apr 96 18:26 BST
> From: Jamie Lokier <jamie@rebellion.co.uk>
>
> I like. How about: gather statistics about working sets in the usual
> way, based on the usual page faults. Do it better, of course.
> Occasionally, unmap a few random pages from random processes, so that
> they trigger page faults and add to the statistics. The pages are still
> in the page cache, of course, so it doesn't take long to map them back.
> Although there is a speed penalty for this, the improved statistics, if
> used well, mean that you are far less likely to swap out pages that are
> actively in use. How else do you find the working set of a process
> without swapping it all to disk?
>
> Or the heuristic could work like this. When we take a page out of a
> processes VMA we add a "current-jiffies" marker to the vma when we do
> this. The next time we take a fault in that vma we see how long the
> jiffies have gone since we took that page away. Using this we decide
> later on whether it is a good idea to steal from that vma. And we
> re-juvenate these values or retest every so often.
>
> Later,
> David S. Miller
> davem@caip.rutgers.edu
>
Here is a more considered view of how this might be done, its the
most efficient and easiest way I can think of.
Swapping handles 3 differnt things:
1. Clear out unused pages - init code etc.
Done already (poorly) can be done slowly
results in page out (not swap) of clean
code segments.
Speculative unpaging is great for this,
every time memory starts getting short,
or load gets very low we should look for
these. When we actually need to swap, we
look first at pages who were unmapped ageas
ago and haven't been accessed since.
2. Get rid of most infrequently used pages.
3. Sequentially accessed data, too much to hold at once
e.g. image processing, matrix operations
Sometimes operations performed at a few lines of
a picture, working data window slides down image
Speculative unmapping copes with all this, and since a
heavy paging currently results in much idle time,
waiting on disc access, hopefully we'll have a lot
of spare time to do this.
Swapping policy: Swap pages who have been unmapped the longest,
by including with all pages a time reference, in jiffies, which
as well as the status (normal, unmapped, swapped) tells us when
it was swapped out, swapped in or last accessed (from unmapping) or when unmapped.
i.e. the last time it changed status.
How do we do the unmapping?
- Surely we can use most of the current swapping code, to unmap but
not swap out the processes access to a page of memory, generating a
page fault on access, where upon we give the page back and update
our last_accessed time.
How do we choose what to unmap?
- ?
We don't want to unmap pages too frequently, although the penalty in time
of unmapping a page is surely a lot less than swapping.
How often do we do the unmapping?
- ?
Perhaps all the time we're idle, the only penalty is an increase in the process
execution time as it slowly gets its pages back. However if this was happening
for every page the CPU accessed this wouldn't be good.
Implementing a mechanism to back off unmapping frequently used pages is hard.
If we unmap all memory with a (guess) penalty of 1ms overhead.
16M / 4k = 4096 pages. i.e. 4 seconds for whole sweep, this is a long time
we must do it in an order fashion, a bit at a time.
We can use the 'last time was known accessed' to determine what we unmap.
Maybe we sweep the page tables and unmap these pages, these will then
have there timers updated if and when they are again accessed.
(A fixed number per sweep) Only used pages will get re-mapped, so after a while
we'll just be unmapping frequently used pages - we must be careful not to do too
much, however my machine has a large idle time normally. Perhaps we should
decrease the number of pages scanned for when CPU is low.
In addition to the above swapping policy, if we have a whole load of sequential
pages all unused for a similarly long period of time we swap them all rather
than just individually. How do we decide how close is close who knows?
This will have to be tested and tuned. - Maybe we could unmap them all
simultaneously, unmap an entire process address space at once maybe. Seems
a bit rough on the process, not quite as bad as swapping it all though.
Doing all the above will guarantee we don't swap out frequently used pages,
although I don't have a heavy swapping system myself (16M Ram) so I don't
have much experience of what people do in very bad swap situations.
Really we ought to get some computer scientist to give some ideas, aren't you
meant to cover such algorithms in your courses? Has anyone got any references?
Or knows how other OS's do it any better?
Jon Burgess - 92jjb@eng.cam.ac.uk
Thanks.
.. . . . . . . . . . . . . . ..
:: : : Jon Burgess 01223-461907 : : ::
:: : jjb1003@cam.ac.uk : : ::
:: : : : : : : : : : : : : : ::