Re: [PATCH v6 00/15] memory-hotplug: hot-remove physical memory

From: Kamezawa Hiroyuki
Date: Thu Jan 10 2013 - 03:24:05 EST


(2013/01/10 16:55), Glauber Costa wrote:
On 01/10/2013 11:31 AM, Kamezawa Hiroyuki wrote:
(2013/01/10 16:14), Glauber Costa wrote:
On 01/10/2013 06:17 AM, Tang Chen wrote:
Note: if the memory provided by the memory device is used by the
kernel, it
can't be offlined. It is not a bug.

Right. But how often does this happen in testing? In other words,
please provide an overall description of how well memory hot-remove is
presently operating. Is it reliable? What is the success rate in
real-world situations?

We test the hot-remove functionality mostly with movable_online used.
And the memory used by kernel is not allowed to be removed.

Can you try doing this using cpusets configured to hardwall ?
It is my understanding that the object allocators will try hard not to
allocate anything outside the walls defined by cpuset. Which means that
if you have one process per node, and they are hardwalled, your kernel
memory will be spread evenly among the machine. With a big enough load,
they should eventually be present in all blocks.


I'm sorry I couldn't catch your point.
Do you want to confirm whether cpuset can work enough instead of
ZONE_MOVABLE ?
Or Do you want to confirm whether ZONE_MOVABLE will not work if it's
used with cpuset ?


No, I am not proposing to use cpuset do tackle the problem. I am just
wondering if you would still have high success rates with cpusets in use
with hardwalls. This is just one example of a workload that would spread
kernel memory around quite heavily.

So this is just me trying to understand the limitations of the mechanism.


Hm, okay. In my undestanding, if the whole memory of a node is configured as
MOVABLE, no kernel memory will not be allocated in the node because zonelist
will not match. So, if cpuset is used with hardwalls, user will see -ENOMEM or OOM,
I guess. even fork() will fail if fallback-to-other-node is not allowed.

If it's configure as ZONE_NORMAL, you need to pray for offlining memory.

AFAIK, IBM's ppc? has 16MB section size. So, some of sections can be offlined
even if they are configured as ZONE_NORMAL. For them, placement of offlined
memory is not important because it's virtualized by LPAR, they don't try
to remove DIMM, they just want to increase/decrease amount of memory.
It's an another approach.

But here, we(fujitsu) tries to remove a system board/DIMM.
So, configuring the whole memory of a node as ZONE_MOVABLE and tries to guarantee
DIMM as removable.

IMHO, I don't think shrink_slab() can kill all objects in a node even
if they are some caches. We need more study for doing that.


Indeed, shrink_slab can only kill cached objects. They, however, are
usually a very big part of kernel memory. I wonder though if in case of
failure, it is worth it to try at least one shrink pass before you give up.


Yeah, now, his (our) approach is never allowing kernel memory on a node to be
hot-removed by ZONE_MOVABLE. So, shrink_slab()'s effect will not be seen.

If other brave guys tries to use ZONE_NORMAL for hot-pluggable DIMM, I see,
it's worth triying.

How about checking the target memsection is in NORMAL or in MOVABLE at
hot-removing ? If NORMAL, shrink_slab() will be worth to be called.

BTW, shrink_slab() is now node/zone aware ? If not, fixing that first will
be better direction I guess.

Thanks,
-Kame


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/