Re: boot time node and memory limit options

From: Jesse Barnes
Date: Wed Mar 17 2004 - 12:54:36 EST


On Wed, Mar 17, 2004 at 09:09:45AM -0800, Dave Hansen wrote:
> Every arch has its own way of describing its layout. Some use "chunks"
> and others like ppc64 use LMB (logical memory blocks). If each arch were
> willing to store its memory layout information in a generic way, then
> we might have a shot at doing a generic mem= or a NUMA version.
>
> I coded this up a few days ago to see if I could replace the x440 SRAT
> chunks with it. I never got around to actually doing that part, but
> something like this is what we need to do *layout* manipulation in an
> architecture-agnostic way.
>
> I started coding this before I thought *too* much about it. What I want
> is a way to get rid of all of the crap that each architecture (and
> subarch) uses to store its physical memory layout. On normal x86 we
> have the e820 and the EFI tables and on Summit/x440, we have yet another
> way to do it.

In some cases (ia64 for example) there are additional restrictions on
each memory chunk. For example, the EFI memory map may describe a
contiguous chunk of memory 28MB in size, but if your kernel page size
was set to 64MB, you'd have to throw it away as unusable. Should that
be dealt with in the arch-independent code (i.e. is similar stuff done
on other platforms?), or is it best to only add sections that are usable?
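
To make the restriction concrete, here is a rough sketch of the kind of
trimming I mean. The names (struct mem_range, trim_to_granule()) are made
up for illustration and are not from any existing code, and it assumes
the granule size is a power of two:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor for one physical range: [start, end). */
struct mem_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Shrink r to the largest sub-range made up of whole, aligned granules.
 * Returns 1 if something usable is left, 0 if the whole range has to be
 * thrown away. Assumes r->start + granule does not overflow.
 */
static int trim_to_granule(struct mem_range *r, uint64_t granule)
{
	uint64_t start, end;

	/* The mask arithmetic below only works for powers of two. */
	assert(granule != 0 && (granule & (granule - 1)) == 0);

	start = (r->start + granule - 1) & ~(granule - 1);	/* round up */
	end = r->end & ~(granule - 1);				/* round down */

	if (start >= end)
		return 0;

	r->start = start;
	r->end = end;
	return 1;
}
```

With a 64MB granule, a 28MB chunk never contains a whole granule, so the
function returns 0 and the chunk is dropped. The open question is just
whether a check like this lives in the generic code or whether each arch
filters its ranges before handing them over.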

> What I'd like to do is present a standard way for all of these
> architectures to store the information that they need to record at boot
> time, plus make something flexible enough that we can use it for stuff
> at runtime when hotplug memory is involved.

That would be great; what you have below seems sensible.

> The code I'd like to see go away from boot-time is anything that deals
> with arch-specific structures like the e820, functions like
> lmb_end_of_DRAM(), or any code that deals with zholes. I'd like to get
> it to a point where we can do a mostly arch-independent mem=.

So what you have here would be only for boot time setup, while
CONFIG_NONLINEAR would be used in lieu of multiple pgdats per node or a
virtual memmap in the case of intranode discontiguous memory?
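
If it is boot time only, I'd picture the generic mem= as roughly the
following. This is only a sketch: struct phys_region and
apply_mem_limit() are hypothetical names, not taken from your patch, and
it assumes the regions are already sorted by start address:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generic layout entry. */
struct phys_region {
	uint64_t start;	/* first byte of the region */
	uint64_t size;	/* length in bytes */
};

/*
 * Keep at most `limit` bytes of RAM, counting regions in array order.
 * The region that crosses the limit is truncated and every later one
 * is dropped. Returns the new number of regions.
 */
static size_t apply_mem_limit(struct phys_region *r, size_t nr,
			      uint64_t limit)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (total >= limit)
			break;			/* limit reached, drop the rest */
		if (r[i].size > limit - total)
			r[i].size = limit - total;	/* truncate last region */
		total += r[i].size;
	}
	return i;
}
```

This version counts bytes of RAM; a cutoff by highest physical address
would compare start + size against the limit instead. Either way nothing
in it needs to know about e820, LMBs or the EFI map.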

Thanks,
Jesse