Re: [PATCH v1 part1 0/9] Introduce movablemem_map boot option.

From: Tang Chen
Date: Mon Mar 18 2013 - 07:11:59 EST

Hi Will,

On 03/17/2013 08:25 AM, Will Huck wrote:

> It seems that Mel don't like this idea.

Thank you for reminding me of this.

And yes, I have read that email. :)

And about this boot option, we have had a long discussion before.
Please refer to:

The situation is:

For now, the Linux kernel cannot migrate memory in the kernel direct
mapping. And there is no way to ensure that ZONE_NORMAL has no kernel
memory. So we can only use ZONE_MOVABLE to ensure that the memory device
can be removed.

For now, I have the following reasons why the movablemem_map boot option
is necessary. Some may have been mentioned before, but I think I need to
state them again here:

1) If we want to hot-remove a memory device, the device should only have
memory of two types:
- kernel memory whose life cycle is the same as the memory device's,
such as pagetables and vmemmap
- user memory that could be migrated.

For type 1: we can allocate it on the local node, as in Yinghai's work,
and free it when hot-removing.
For type 2: we can migrate it at run time. But it must be in ZONE_MOVABLE
because we cannot ensure that ZONE_NORMAL has no kernel memory.

So we need a way to confine hotpluggable memory to ZONE_MOVABLE.
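
As a rough illustration of the rule in 1), here is a small user-space
sketch, not kernel code. The enum, its values and the function name are
all made up for this example; the real kernel tracks this information
very differently. It only shows the idea that a single page of ordinary
kernel memory is enough to block hot-remove of the whole device.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of the pages on one memory device. */
enum page_kind {
    PAGE_KERNEL_DEVICE, /* type 1: pagetables/vmemmap, freed with the device */
    PAGE_USER_MOVABLE,  /* type 2: user memory that could be migrated */
    PAGE_KERNEL_OTHER   /* any other kernel memory: cannot be migrated */
};

/*
 * Returns true only if every page on the device is of type 1 or type 2.
 * An empty device (nr_pages == 0) is trivially removable.
 */
bool device_is_removable(const enum page_kind *pages, size_t nr_pages)
{
    for (size_t i = 0; i < nr_pages; i++) {
        if (pages[i] == PAGE_KERNEL_OTHER)
            return false;
    }
    return true;
}
```

In this model, keeping hotpluggable memory in ZONE_MOVABLE corresponds to
guaranteeing that PAGE_KERNEL_OTHER never appears on the device.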

2) We have the following ways to do it:
a) use SRAT, which I have already implemented
b) specify physical address ranges, which I have implemented too, but
obviously very few people like it.
c) specify node id. But the nid could be changed by firmware on some
platforms.

Because of c), we chose to use physical address ranges. To satisfy all
users, I also implemented a).

3) Even if we don't specify physical addresses on the command line and use
SRAT instead, we still need the logic in this patch-set to achieve the
same goal.

4) Since setting a whole node as movable will degrade NUMA performance,
no matter which way we use, we always need an interface to enable or
disable this functionality.
The boot option itself is such an interface. If users don't specify it on
the command line, the kernel will work as before.

So I do want to try again to push this boot option. :)

With this boot option, memory hotplug will work now.

It's true that we could reimplement the whole mm in Linux to make kernel
memory migratable, but we would need to handle a lot of problems. I agree
with Mel, but that is a long way off.

And the work in the near future:
1) Allocate pagetables and vmemmap on the local node, as Yinghai said.
2) Do the proper modification for hot-add and hot-remove.
- Reserve memory for pagetables and vmemmap when hot-add, maybe use
- Free all pagetables and vmemmap before hot-remove.
3) As for Mel's advice to modify memory management in Linux so that
kernel pages can be migrated, that is a long way off. I think we can
discuss it more.

Thanks. :)
