Re: [Lhms-devel] [PATCH 0/7] Fragmentation Avoidance V19

From: Martin J. Bligh
Date: Thu Nov 03 2005 - 12:51:29 EST


--Linus Torvalds <torvalds@xxxxxxxx> wrote (on Thursday, November 03, 2005 09:19:35 -0800):

>
>
> On Thu, 3 Nov 2005, Martin J. Bligh wrote:
>>
>> The problem is how these zones get resized. Can we hotplug memory between
>> them, with some sparsemem like indirection layer?
>
> I think you should be able to add them. You can remove them. But you can't
> resize them.
>
> And I suspect that by default, there should be zero of them. Ie you'd have
> to set them up the same way you now set up a hugetlb area.

So ... if there are 0 by default, and I run for a while and dirty up
memory, how do I free any pages up to put into them? Not sure how that
works.

Going back to finding contig pages for a sec ... I don't disagree with
your assertion that order 1 is doable (however, we do need to make one
fix ...see below). It's > 1 that's a problem.

For amusement, let me put in some tritely oversimplified math. For the
sake of arguement, assume the free watermarks are 8MB or so. Let's assume
a clean 64-bit system with no zone issues, etc (ie all one zone). 4K pages.
I'm going to assume random distribution of free pages, which is
oversimplified, but I'm trying to demonstrate a general premise, not get
accurate numbers.

8MB = 2048 pages.

On a 64MB system, we have 16384 pages, 2048 free. Very rougly speaking, for
each free page, chance of it's buddy being free is 2048/16384. So in
grossly-oversimplified stats-land, if I can remember anything at all,
chance of finding one page with a free buddy is 1-(1-2048/16384)^2048,
which is, for all intents and purposes ... 1.

1 GB. system, 262144 pages 1-(1-2048/16384)^2048 = 0.9999989

128GB system. 33554432 pages. 0.1175 probability

yes, yes, my math sucks and I'm a simpleton. The point is that as memory
gets bigger, the odds suck for getting contiguous pages. And would also
explain why you think there's no problem, and I do ;-) And bear in mind
that's just for order 1 allocs. For bigger stuff, it REALLY sucks - I'll
spare you more wild attempts at foully-approximated math.

Hmmm. If we keep 128MB free, that totally kills off the above calculation
I think I'll just tweak it so the limit is not so hard on really big
systems. Will send you a patch. However ... larger allocs will still
suck ... I guess I'd better gross you out with more incorrect math after
all ...

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/