Some thoughts about cache and swap

From: Lasse Kärkkäinen / Tronic
Date: Sat Jun 05 2004 - 13:17:25 EST


One area where most current cache/swap implementations fail miserably is when the user, for some reason, processes large amounts of data. This may be as simple as listening to a large MP3 collection for a few hours: the working set becomes very large, everything else gets pushed out of RAM, and its place is taken by cached songs.

Incidentally, in such cases caching that content is usually a complete waste: it is unlikely that the same data will be needed again, and even if it is, speed is unlikely to be an issue.

In order to make better use of the limited cache space, the following methods could be used:

1. highly preferable to cache small files only
* disk/net access has a large seek latency, while caching small files
  costs little RAM
* applications with long loading times usually read a large number of
  tiny files => caching those improves response times a lot
* higher probability of getting more hits (per unit of cache space consumed)

1.1. if caching large files anyway
* try to detect access type (sequential, highly spatial or random)
* only cache the hottest parts of the file

2. only cache files where I/O is the bottleneck
* if the application doesn't need the data any faster, having it in
  cache isn't very useful either
* detecting whether I/O is the limiting factor is difficult

Additionally, for machines with oversized RAM (like nearly all desktop/server computers):

3. never (or only rarely) swap out applications to gain more cache
* eventually the application will be restored to RAM, and the user will
  notice major thrashing with long delays, and blame the OS
* applications take only a small portion of the RAM, so using that
  as extra cache makes only a small difference in cache performance
* if an application or its data has been loaded into memory, there is
  normally a reason for that (i.e. the data needs to be accessed quickly)

3.1. memory leaks are an exception (but fixing the bug would be the correct solution, rather than obscuring the problem by swapping the leaked pages out)

The definition of a "large file" changes over time, as technology evolves:
* the size of RAM (and thus the available cache space) grows
* on a modern HDD, reading one megabyte and doing a single seek each
  consume roughly the same time - about 13 ms (notice how evil seeking is!)

- Tronic -
