Re: [RFC][PATCH] big continuous memory allocator v2

From: KAMEZAWA Hiroyuki
Date: Tue Sep 07 2010 - 05:09:16 EST


On Tue, 7 Sep 2010 10:46:35 +0200
Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:

> On Tue, 7 Sep 2010 17:25:59 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
>
> > On Tue, 07 Sep 2010 09:29:21 +0200
> > Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> >
> > > KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> writes:
> > >
> > > > This is a page allocator based on memory migration/hotplug code.
> > > > passed some small tests, and maybe easier to read than previous
> > > > one.
> > >
> > > Maybe I'm missing context here, but what is the use case for this?
> > >
> >
> > I hear some drivers want to allocate xxMB of continuous area. (camera?)
> > Maybe the embedded guys can answer the question.
>
> Ok what I wanted to say -- assuming you can make this work
> nicely, and the delays (swap storms?) likely caused by this are not
> too severe, it would be interesting for improving the 1GB pages on x86.
>

Oh, I didn't consider that. Hmm. If x86 really wants to support 1GB pages,
MAX_ORDER should be raised. (I'm sorry if this was already discussed.)
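
To show why MAX_ORDER matters here, a rough userspace sketch of the
arithmetic. The 4KB base page size and MAX_ORDER of 11 are assumptions
(typical x86 defaults, not taken from the patch), and the SKETCH_* names
and both helpers are made up for illustration:

```c
/* Assumed values: 4KB base pages and a buddy MAX_ORDER of 11. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_MAX_ORDER  11

/* Smallest order such that (1 << order) base pages cover `bytes`. */
static unsigned int order_for_size(unsigned long long bytes)
{
	unsigned long long page_size = 1ULL << SKETCH_PAGE_SHIFT;
	unsigned long long pages = (bytes + page_size - 1) >> SKETCH_PAGE_SHIFT;
	unsigned int order = 0;

	while ((1ULL << order) < pages)
		order++;
	return order;
}

/* The buddy allocator only serves orders 0 .. MAX_ORDER - 1. */
static int fits_in_buddy(unsigned int order)
{
	return order < SKETCH_MAX_ORDER;
}
```

Under these assumptions a 1GB page is 2^18 base pages, so it needs order 18,
well above what the buddy allocator hands out directly.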


> This would be a major use case and probably be enough
> to keep the code around.
>
> But it depends on how well it works.
>
Sure.

> e.g. when the zone is already fully filled how long
> does the allocation of 1GB take?
>
Probably not very quick; it may even be slow.

> How about when parallel programs are allocating/freeing
> in it too?
>
This code doesn't assume that. I wonder whether I should add a mutex, because
this code generates IPIs to drain some per-cpu lists.

I think 1GB pages should be preallocated, as the current hugepage code does.


> What's the worst case delay under stress?
>
Memory offline itself is robust against stress because it makes the
pageblock ISOLATED. But allocating 1GB of memory is the problem.
I have an idea (see below).

> Does it cause swap storms?
>
Probably the same as allocating 1GB of memory when memory is full.
It's a matter of the LRU.


> One issue is also that it would be good to be able to decide
> in advance if the OOM killer is likely triggered (and if yes
> reject the allocation in the first place).
>

Checking the amount of memory and swap before starting?
That sounds nice. I'd like to add something like that.
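
A minimal sketch of what such a check could look like, written as plain
userspace C. The struct, its fields, and the function name are all
hypothetical, and it simplifies by treating every busy page as either
migratable or swappable (in reality only some pages can go to swap):

```c
#include <stdbool.h>

/* Hypothetical snapshot of counters taken before the allocation starts. */
struct mem_snapshot {
	unsigned long free_pages;      /* free pages outside the target range */
	unsigned long free_swap_pages; /* free swap slots */
};

/*
 * Return true if the pages currently in use inside the target range
 * could find room elsewhere, either in free memory or in swap.
 * Return false to reject the request before any migration starts.
 */
static bool big_alloc_precheck(const struct mem_snapshot *s,
			       unsigned long busy_pages_in_range)
{
	if (busy_pages_in_range <= s->free_pages)
		return true;
	/* Subtract first so the comparison cannot overflow. */
	return busy_pages_in_range - s->free_pages <= s->free_swap_pages;
}
```

The counters are only a snapshot, so this can reduce the chance of
triggering the OOM killer but cannot rule it out.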

Or changing my patch's logic as follows:

1. allocate the required migration target pages (of 1GB).
2. start migration to the allocated pages.
3. create a big page.

Then we can use some GFP_XXXX flags at (1) and do some tuning, as the usual
VM code does.
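
The three steps can be sketched as a toy userspace model. This is not the
patch's code: struct toy_page, toy_alloc_big() and the max_targets limit
(a stand-in for whatever the GFP flags at (1) would allow) are all
hypothetical names for illustration:

```c
#include <stdbool.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE 64	/* tiny "page", for illustration only */

struct toy_page {
	bool in_use;
	char data[TOY_PAGE_SIZE];
};

/*
 * Returns 0 on success, -1 if the targets cannot be allocated.
 * On failure the range is left untouched. On success the caller owns
 * the now-empty range as one big page and must free(*out_targets).
 */
static int toy_alloc_big(struct toy_page *range, size_t nr,
			 size_t max_targets,
			 struct toy_page **out_targets, size_t *out_nr)
{
	struct toy_page *targets;
	size_t busy = 0, t = 0, i;

	for (i = 0; i < nr; i++)
		if (range[i].in_use)
			busy++;

	/* (1) allocate all migration targets up front, fail early if we cannot */
	if (busy > max_targets)
		return -1;
	targets = calloc(busy ? busy : 1, sizeof(*targets));
	if (!targets)
		return -1;

	/* (2) migrate every busy page in the range to a target */
	for (i = 0; i < nr; i++) {
		if (!range[i].in_use)
			continue;
		targets[t++] = range[i];
		range[i].in_use = false;
	}

	/* (3) the range is now entirely free and can be used as one big page */
	*out_targets = targets;
	*out_nr = t;
	return 0;
}
```

The point of doing (1) first is that a failure costs nothing: no page has
been moved yet, so the request can simply be rejected.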

Thanks,
-Kame
