Re: Please don't beat me up (was Re: Bugs and wishes in memory management area)

Kevin Buhr (buhr@stat.wisc.edu)
27 Nov 1996 14:06:36 -0600


-----BEGIN PGP SIGNED MESSAGE-----

Mike Jagdis <mike@roan.co.uk> writes:
|
| But you don't need to reorganize, you don't need paging and you don't
| need to emulate it.

If I understand correctly, the primary strength of the system you are
working on is that it wastes less memory in satisfying allocation
requests. As the user of a tiny little 8 meg machine, you can't
imagine how much I look forward to getting my filthy hands on this
code. ;)

However, the buddy system already does a good job of actually
satisfying small-allocation requests, even if it wastes lots of the
space it has available. That is to say, your new code has the effect
of magically boosting the size of physical memory, but the actual
failure rates of a "kmalloc(1024,GFP_KERNEL)" won't differ, even if
that call is more likely to cause a page fault under the buddy system
than under yours.

My problem is that the buddy system is very prone to fragmentation:
when memory becomes fragmented, it *often* completely fails to satisfy
large allocations (and worse yet, the current kernel mechanisms for
freeing pages make no attempt to address fragmentation problems---they
try to free "any old page" not "a page next to another free page").

It sounds like your system will be better at satisfying large
allocation requests, simply because it will be able to find, for
example, 9000 bytes starting *anywhere* (or almost anywhere) in memory
instead of requiring four consecutive pages. This seems to be a happy
side effect of your real goals: avoiding big chunks of wasted memory
and optimising the small-allocation case.

The proof of the pudding is in the eating. If your system makes my fragmentation
problem vanish, I will click my heels together with glee and do a
dance of joy. ;)

However, even after your code is in the kernel, it will still be
possible for a multipage allocation request to fail even though there
are plenty of pages all over the place. If this situation is very
unlikely, then it's not so bad. We can still imagine deadlock
scenarios (where every runnable process needs a 4-page block to
continue and yet allocated memory looks like a big checkerboard), but
if they are incredibly improbable, it's no big deal.

My proposed scheme is ugly, but it's for emergencies. Click, click,
click, click and four nonconsecutive pages have been mapped into the
land of make believe. One cache invalidate, and we've prevented
deadlock. Since the virtual address space is so damn big (ha ha), we
can make fragmentation of the "imaginary" memory impossible, simply by
mapping on 32-page boundaries (the size of the maximum allocation
currently allowed by "kmalloc"), and we never have to deal with the
ugliness (that Ingo was talking about) of reorganizing
already-allocated mappings.

Its cost? (1) Even if an emergency never occurs, every "kfree" will
need another check, but depending on the actual code your new
allocator uses, perhaps we'd be able to get this check done for free;
(2) It might break nasty virtual/physical address dependencies deep
inside some device driver somewhere despite the existence of a
GFP_VIRTOK flag; (3) It's ugly.

I guess we'll know whether these costs outweigh potential benefits
when we get some idea of how likely bad fragmentation is with your new
code in place. I certainly look forward to giving it a test run!

Kevin <buhr@stat.wisc.edu>

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3
Charset: noconv
Comment: Processed by Mailcrypt 3.4, an Emacs/PGP interface

iQBVAwUBMpyfSImVIQW1OgXhAQFgngIAklMJsDtBEJHo0gpil7+ynWmjXh6LHx4r
EOtpGVF7Sa/bfm77X5FUc1Fy8JM5DStd/V0CPfa1ER4rbIJizhJKmQ==
=NcdJ
-----END PGP SIGNATURE-----