Re: [00/41] Large Blocksize Support V7 (adds memmap support)

From: Nick Piggin
Date: Tue Sep 11 2007 - 14:47:36 EST


On Wednesday 12 September 2007 04:25, Maxim Levitsky wrote:

> Hi,
>
> I think that the fundamental problem is not fragmentation/large pages/...
>
> The problem is the VM itself.
> The VM doesn't use virtual memory, that's all; that's the problem.
> Although this will probably be Linux 3.0, I think that the right way to
> solve all those problems is to make all kernel memory vmalloced (except a
> few areas like kernel .text)
>
> It will suddenly remove the buddy allocator, it will remove the need for
> highmem, it will allow allocating any amount of memory (for example, 4k
> stacks will be obsolete).
> It will even allow kernel memory to be swapped to disk.
>
> This is the solution, but it is very very hard.

I'm not sure that it is too hard. OK, it is far from trivial...

This is not a new idea though; it has been floated around for a long
time (since before Linux I'm sure, although I have no references).

There are lots of reasons why such an approach has fundamental
performance problems too, however. Your kernel can't use huge TLB
entries for a lot of memory, you can't find the physical address of
a page without walking page tables, and defragmenting still has a
significant cost in terms of moving pages, flushing TLBs, etc.
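
To make the address-translation point concrete, here is a minimal
userspace sketch, not kernel code. Every name and constant in it
(the sketch_* functions, SKETCH_PAGE_OFFSET, the two-level table
layout) is a hypothetical stand-in chosen for illustration. With a
linear mapping the physical address is one subtraction; with a
virtually mapped kernel it takes a dependent memory load per
page-table level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for illustration; not any real architecture. */
#define SKETCH_PAGE_SHIFT   12u
#define SKETCH_PAGE_SIZE    (1ull << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_OFFSET  0xC0000000ull  /* assumed base of the linear map */
#define SKETCH_LEVEL_BITS   9u
#define SKETCH_ENTRIES      (1u << SKETCH_LEVEL_BITS)

/* Toy two-level page table: root -> leaf -> page frame number. */
struct sketch_leaf {
    uint64_t pfn[SKETCH_ENTRIES];
    uint8_t  present[SKETCH_ENTRIES];
};

struct sketch_root {
    struct sketch_leaf *leaf[SKETCH_ENTRIES];
};

/* Linear (identity-style) mapping: translation is a single subtraction. */
static inline uint64_t sketch_linear_virt_to_phys(uint64_t vaddr)
{
    assert(vaddr >= SKETCH_PAGE_OFFSET);
    return vaddr - SKETCH_PAGE_OFFSET;
}

/*
 * Virtually mapped memory: translation needs a table walk.
 * Returns 0 and stores the physical address on success,
 * or -1 if the address is not mapped.
 */
static int sketch_walk_virt_to_phys(const struct sketch_root *root,
                                    uint64_t vaddr, uint64_t *paddr)
{
    size_t top = (vaddr >> (SKETCH_PAGE_SHIFT + SKETCH_LEVEL_BITS))
                 & (SKETCH_ENTRIES - 1);
    size_t idx = (vaddr >> SKETCH_PAGE_SHIFT) & (SKETCH_ENTRIES - 1);

    const struct sketch_leaf *leaf = root->leaf[top];  /* first load */
    if (leaf == NULL || !leaf->present[idx])           /* second load */
        return -1;

    *paddr = (leaf->pfn[idx] << SKETCH_PAGE_SHIFT)
             | (vaddr & (SKETCH_PAGE_SIZE - 1));
    return 0;
}
```

Real page tables have more levels and the walk can miss in cache at
each one, so the gap is larger than this toy suggests.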

So the train of thought up to now has been that a virtually mapped
kernel would be "the problem with the VM itself" ;)

We're actually at a point now where higher order allocations are
pretty rare and not a big problem (except with very special cases
like hugepages and memory hotplug, which can mostly get away
with compromises, so we don't want to turn over the kernel just
for these).
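
For anyone unsure what "order" means here: an order-n allocation is
2^n physically contiguous pages. A rough sketch of the arithmetic,
assuming 4 KiB pages; sketch_order_for_size() and
SKETCH_BUDDY_PAGE_SIZE are made-up names for illustration, not
kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BUDDY_PAGE_SIZE 4096u  /* assumed page size */

/* Smallest order n such that 2^n pages hold the requested bytes. */
static unsigned int sketch_order_for_size(size_t bytes)
{
    /* Guard the round-up below against overflow. */
    assert(bytes <= SIZE_MAX - (SKETCH_BUDDY_PAGE_SIZE - 1));

    size_t pages = (bytes + SKETCH_BUDDY_PAGE_SIZE - 1)
                   / SKETCH_BUDDY_PAGE_SIZE;
    unsigned int order = 0;

    while (((size_t)1 << order) < pages)
        order++;
    return order;
}
```

Under that assumption a single page is order 0, while a 64 KiB
buffer needs order 4, i.e. 16 contiguous free pages, which is what
gets harder to find as memory fragments.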

So in my opinion, any increase in the dependence on higher order
allocations is simply a bad move until a killer use-case can be found.
It moves us further away from good behaviour on our assumed
ideal of an identity mapped kernel.

(I don't actually dislike the idea of a virtually mapped kernel. Maybe
hardware trends will favour that model, and there are some potential
simple instructions a CPU can implement to help with some of the
performance hits. I'm sure it will be looked at again for Linux one day.)