Re: [RFC][PATCH 00/26] sched/numa

From: Andrea Arcangeli
Date: Mon Mar 19 2012 - 09:58:57 EST


On Mon, Mar 19, 2012 at 02:26:31PM +0100, Peter Zijlstra wrote:
> On Mon, 2012-03-19 at 14:04 +0100, Andrea Arcangeli wrote:
> > If you boot with memcg compiled in, that's taking an equivalent amount
> > of memory per-page.
> >
> > If you can bear the memory loss when memcg is compiled in even when
> > not enabled, you sure can bear it on NUMA systems that have lots of
> > memory, so it's perfectly ok to sacrifice a bit of it so that it
> > performs like not-NUMA but you still have more memory than not-NUMA.
> >
> I think the overhead of memcg is quite insane as well. And no I cannot
> bear that and have it disabled in all my kernels.
>
> NUMA systems having lots of memory is a false argument, that doesn't
> mean we can just waste tons of it, people pay good money for that
> memory, they want to use it.
>
> In fact, I know that HPC people want things like swap-over-nfs so they
> can push infrequently running system crap out into swap so they can get
> these few extra megabytes of memory. And you're proposing they give up
> ~100M just like that?

With your code they will get -ENOMEM from split_vma and a slowdown in
all regular page faults and vma mangling operations, before they run
out of memory...

The per-page memory loss is 24 bytes, so AutoNUMA costs about 0.5% of
RAM (and only when booted on NUMA hardware without the noautonuma
parameter). I can't imagine that being a problem on a system where the
hardware vendor took shortcuts to install massive amounts of RAM that
is fast to access only locally. If you buy that kind of hardware,
losing 0.5% of its RAM is negligible compared to the programmer cost
of patching all apps.
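
To make the numbers above easy to check, here is a rough sketch of the
arithmetic. The 4 KiB page size and the 16 GiB machine are assumptions
picked for illustration, not figures from this thread; only the 24
bytes per page comes from the text above. The function names are
hypothetical.

```python
PAGE_SIZE = 4096  # assumed base page size in bytes (4 KiB)


def overhead_fraction(bytes_per_page: int, page_size: int = PAGE_SIZE) -> float:
    """Fraction of RAM consumed by a fixed per-page metadata cost."""
    return bytes_per_page / page_size


def overhead_bytes(total_ram: int, bytes_per_page: int,
                   page_size: int = PAGE_SIZE) -> int:
    """Total metadata bytes for total_ram bytes of memory."""
    pages = total_ram // page_size
    return pages * bytes_per_page


if __name__ == "__main__":
    ram = 16 * 2**30  # hypothetical 16 GiB machine
    print(f"fraction of RAM: {overhead_fraction(24):.3%}")
    print(f"total overhead:  {overhead_bytes(ram, 24) / 2**20:.0f} MiB")
```

Under those assumptions the per-page cost works out to a bit under
0.6% of RAM, in line with the rounded 0.5% figure, and it scales
linearly with installed memory.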