Re: memory hotremove prototype, take 3

From: IWAMOTO Toshihiro
Date: Wed Dec 03 2003 - 23:00:03 EST


At Wed, 03 Dec 2003 11:41:01 -0800,
Martin J. Bligh <mbligh@xxxxxxxxxxx> wrote:
>
> > this is a new version of my memory hotplug prototype patch, against
> > linux-2.6.0-test11.
> >
> > Freeing 100% of a specified memory zone is non-trivial and necessary
> > for memory hot removal. This patch splits memory into 1GB zones, and
> > implements complete zone memory freeing using kswapd or "remapping".
> >
> > A bit more detailed explanation and some test scripts are at:
> > http://people.valinux.co.jp/~iwamoto/mh.html
> >
> > Main changes against previous versions are:
> > - The stability is greatly improved. Kernel crashes (probably related
> > with kswapd) still happen, but they are rather rare so that I'm
> > having difficulty reproducing crashes.
> > Page remapping under simultaneous tar + rm -rf works.
> > - Implemented a solution to a deadlock caused by ext2_rename, which
> > increments a refcount of a directory page twice.
> >
> > Questions and comments are welcome.
>
> I really think that doing this over zones and pgdats isn't the best approach.
> You're going to make memory allocation and reclaim vastly less efficient,
> and you're exposing a bunch of very specialised code inside the main
> memory paths.

I used the discontigmem code because this is what we have now.
My hacks such as zone_active[] will go away when the memory hot add
code (on which Goto-san is working on) is ready.

> Have you looked at Daniel's CONFIG_NONLINEAR stuff? That provides a much
> cleaner abstraction for getting rid of discontiguous memory in the non
> truly-NUMA case, and should work really well for doing mem hot add / remove
> as well.

Thanks for pointing out. I looked at the patch.
It should be doable to make my patch work with the CONFIG_NONLINEAR
code. For my code to work, basically the following functionarities
are necessary:
1. disabling alloc_page from hot-removing area
and
2. enumerating pages in use in hot-removing area.

My target is somewhat NUMA-ish and fairly large. So I'm not sure if
CONFIG_NONLINEAR fits, but CONFIG_NUMA isn't perfect either.


> PS. What's this bit of the patch for?
>
> void *vmalloc(unsigned long size)
> {
> +#ifdef CONFIG_MEMHOTPLUGTEST
> + return __vmalloc(size, GFP_KERNEL, PAGE_KERNEL);
> +#else
> return __vmalloc(size, GFP_KERNEL | __GFP_HIGHMEM, PAGE_KERNEL);
> +#endif
> }

This is necessary because kernel memory cannot be swapped out.
Only highmem can be hot removed, though it doesn't need to be highmem.
We can define another zone attribute such as GFP_HOTPLUGGABLE.

--
IWAMOTO Toshihiro
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/