Re: O_DIRECT question

From: Bill Davidsen
Date: Fri Jan 12 2007 - 09:58:42 EST


Aubrey wrote:
On 1/12/07, Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
Linus Torvalds wrote:
>
> On Fri, 12 Jan 2007, Nick Piggin wrote:
>
>>We are talking about about fragmentation. And limiting pagecache to try to
>>avoid fragmentation is a bandaid, especially when the problem can be solved
>>(not just papered over, but solved) in userspace.
>
>
> It's not clear that the problem _can_ be solved in user space.
>
> It's easy enough to say "never allocate more than a page". But it's often
> not REALISTIC.
>
> Very basic issue: the perfect is the enemy of the good. Claiming that
> there is a "proper solution" is usually a total red herring. Quite often
> there isn't, and the "paper over" is actually not papering over, it's
> quite possibly the best solution there is.

Yeah *smallish* higher order allocations are fine, and we use them all the
time for things like stacks or networking.

But Aubrey (who somehow got removed from the cc list) wants to do order 9
allocations from userspace in his nommu environment. I'm just trying to be
realistic when I say that this isn't going to be robust and a userspace
solution is needed.

Hmm..., aside from big order allocations from user space, if there is
a large application we need to run, it should be loaded into the
memory, so we have to allocate a big block to accommodate it. kernel
fun like load_elf_fdpic_binary() etc will request contiguous memory,
then if vfs eat up free memory, loading fails.
Before we had virtual memory we had only a base address register, start at this location and go thus far, and user program memory had to be contiguous. To change a program size, all other programs might be moved, either by memory copy or actual swap to disk if total memory became a problem. To minimize the pain, programs were loaded at one end of memory, and system buffers and such were allocated at the other. That allowed the most recently loaded program the best chance of being able to grow without thrashing.

The point is that if you want to be able to allocate at all, sometimes you will have to write dirty pages, garbage collect, and move or swap programs. The hardware is just too limited to do something less painful, and the user can't see memory to do things better. Linus is right, 'Claiming that there is a "proper solution" is usually a total red herring. Quite often there isn't, and the "paper over" is actually not papering over, it's quite possibly the best solution there is.' I think any solution is going to be ugly, unfortunately.

--
bill davidsen <davidsen@xxxxxxx>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/