Re: [1/8,v3] NUMA Hotplug Emulator: add function to hide memory region via e820 table.

From: David Rientjes
Date: Sat Nov 20 2010 - 19:45:23 EST


On Fri, 19 Nov 2010, Shaohui Zheng wrote:

> > > > > Index: linux-hpe4/arch/x86/kernel/e820.c
> > > > > ===================================================================
> > > > > --- linux-hpe4.orig/arch/x86/kernel/e820.c 2010-11-15 17:13:02.483461667 +0800
> > > > > +++ linux-hpe4/arch/x86/kernel/e820.c 2010-11-15 17:13:07.083461581 +0800
> > > > > @@ -971,6 +971,7 @@
> > > > > }
> > > > >
> > > > > static int userdef __initdata;
> > > > > +static u64 max_mem_size __initdata = ULLONG_MAX;
> > > > >
> > > > > /* "mem=nopentium" disables the 4MB page tables. */
> > > > > static int __init parse_memopt(char *p)
> > > > > @@ -989,12 +990,28 @@
> > > > >
> > > > > userdef = 1;
> > > > > mem_size = memparse(p, &p);
> > > > > - e820_remove_range(mem_size, ULLONG_MAX - mem_size, E820_RAM, 1);
> > > > > + e820_remove_range(mem_size, max_mem_size - mem_size, E820_RAM, 1);
> > > > > + max_mem_size = mem_size;
> > > > >
> > > > > return 0;
> > > > > }
> > > >
> > > > This needs memmap= support as well, right?
> > > We did not test the combination of memmap= with the numa=hide parameter.
> > > I think the result should be similar to mem=XX, since both remove a memory
> > > region from the e820 table.
> > >
> >
> > You've modified the parser for mem= but not memmap= so the change needs
> > additional support for the latter.
> >
>
> The parser for mem= is not modified; the changed parser is numa=. I added an
> additional option, numa=hide=.
>

The above hunk is modifying the x86 parser for the mem= parameter.
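
To make concrete what that hunk changes, here is a minimal userspace model
of the patched mem= handling. It is only a sketch: struct removed_range and
model_parse_memopt() are hypothetical names for illustration, not kernel
code, and the (start, size) pair stands in for the arguments the hunk passes
to e820_remove_range().

```c
#include <stdint.h>

/* Hypothetical stand-in for the (start, size) pair given to e820_remove_range(). */
struct removed_range {
	uint64_t start;
	uint64_t size;
};

/*
 * Model of the patched mem= handling: remove [mem_size, *max_mem_size)
 * and then remember mem_size as the new upper bound, as the hunk does.
 */
static struct removed_range model_parse_memopt(uint64_t mem_size,
					       uint64_t *max_mem_size)
{
	struct removed_range r = {
		.start = mem_size,
		.size = *max_mem_size - mem_size,
	};

	*max_mem_size = mem_size;
	return r;
}
```

With *max_mem_size starting at the equivalent of ULLONG_MAX, the first call
removes the same range as the unpatched code; only later calls see the
smaller bound.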

> > Your patchset doesn't do that; I'm talking specifically about the amount
> > of memory left behind so that the kernel at least still boots. That seems
> > to be a function of e820_hide_mem() to do some sanity checking so we
> > actually still get a kernel rather than the responsibility of the
> > command-line parser.
>
> How much memory is enough to make sure the kernel can still boot? That is very
> hard to measure, and it is almost impossible to get an exact figure. When I tried
> leaving very little memory to the kernel (hiding most of it with numa=hide), it
> panicked immediately.
>
> I have no idea about it, do you have any suggestions?
>

Yes, I think we should use FAKE_NODE_MIN_SIZE to represent the smallest
node that may be added, and so the appropriate behavior of e820_hide_mem()
would be to leave at least this quantity behind for the kernel to be
loaded.
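
A rough sketch of that sanity check, written as plain userspace C rather
than as a patch against e820_hide_mem(). The helper name clamp_hide_size()
is hypothetical, and the 64MB value given to FAKE_NODE_MIN_SIZE here is an
assumption for illustration only; the kernel's own definition should be used.

```c
#include <stdint.h>

/* Assumed value for illustration only; use the kernel's own definition. */
#define FAKE_NODE_MIN_SIZE ((uint64_t)64 << 20)

/*
 * Hypothetical helper: given the total usable RAM and the amount the
 * user asked to hide, return how much may actually be hidden so that
 * at least FAKE_NODE_MIN_SIZE stays visible to the kernel.
 */
static uint64_t clamp_hide_size(uint64_t total_ram, uint64_t hide_size)
{
	uint64_t max_hide;

	/* Not enough RAM to spare anything: hide nothing. */
	if (total_ram <= FAKE_NODE_MIN_SIZE)
		return 0;

	max_hide = total_ram - FAKE_NODE_MIN_SIZE;
	return hide_size < max_hide ? hide_size : max_hide;
}
```

Clamping rather than rejecting the request is just one option; failing the
numa=hide= option with a warning would satisfy the same constraint.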