Re: [PATCH 0/6] mm: make movable onlining suck less

From: Reza Arbab
Date: Wed Apr 05 2017 - 10:53:39 EST


On Wed, Apr 05, 2017 at 11:24:27AM +0200, Michal Hocko wrote:
>On Wed 05-04-17 08:42:39, Michal Hocko wrote:
>> On Tue 04-04-17 16:43:39, Reza Arbab wrote:
>> > It's new. Without this patchset, I can repeatedly
>> > add_memory()->online_movable->offline->remove_memory() all of a node's
>> > memory.
>>
>> This is quite unexpected because the code obviously cannot handle the
>> first memory section. Could you paste /proc/zoneinfo and
>> grep . -r /sys/devices/system/memory/memory*/valid_zones after
>> onlining for both patched and unpatched kernels?
>
>Btw. how do you test this? I am really surprised you managed to
>hotremove such a low pfn range.

When I boot, I have node 0 (4GB) and node 1 (empty):

Early memory node ranges
  node   0: [mem 0x0000000000000000-0x00000000ffffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x00000000ffffffff]
On node 0 totalpages: 65536
  DMA zone: 64 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 65536 pages, LIFO batch:1
Could not find start_pfn for node 1
Initmem setup node 1 [mem 0x0000000000000000-0x0000000000000000]
On node 1 totalpages: 0

My steps from there:

1. add_memory(1, 0x100000000, 0x100000000)
2. echo online_movable > /sys/devices/system/node/node1/memory[511..256]
3. echo offline > /sys/devices/system/node/node1/memory[256..511]
4. remove_memory(1, 0x100000000, 0x100000000)
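
In case it helps to reproduce, steps 1 and 4 could be driven by a minimal
test module along these lines (the module name, the hardcoded constants,
and the lack of locking/error handling are mine; loading the module does
step 1, unloading it does step 4, and steps 2 and 3 are plain sysfs writes
from userspace):

/*
 * hotplug_test.c -- hypothetical module sketching steps 1 and 4 above.
 * The node id, start address, and size are hardcoded to match the
 * example; serialization against concurrent device hotplug and error
 * handling are glossed over.
 */
#include <linux/module.h>
#include <linux/memory_hotplug.h>

#define TEST_NID	1
#define TEST_START	0x100000000ULL	/* 4GB */
#define TEST_SIZE	0x100000000ULL	/* 4GB */

static int __init hotplug_test_init(void)
{
	/* Step 1: create the memmap and register memory blocks on node 1. */
	return add_memory(TEST_NID, TEST_START, TEST_SIZE);
}

static void __exit hotplug_test_exit(void)
{
	/* Step 4: tear the (already offlined) range back down. */
	remove_memory(TEST_NID, TEST_START, TEST_SIZE);
}

module_init(hotplug_test_init);
module_exit(hotplug_test_exit);
MODULE_LICENSE("GPL");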

After step 2, regardless of kernel:

$ cat /proc/zoneinfo
Node 0, zone DMA
  per-node stats
      nr_inactive_anon 418
      nr_active_anon 2710
      nr_inactive_file 4895
      nr_active_file 1945
      nr_unevictable 0
      nr_isolated_anon 0
      nr_isolated_file 0
      nr_pages_scanned 0
      workingset_refault 0
      workingset_activate 0
      workingset_nodereclaim 0
      nr_anon_pages 2654
      nr_mapped 739
      nr_file_pages 7314
      nr_dirty 1
      nr_writeback 0
      nr_writeback_temp 0
      nr_shmem 474
      nr_shmem_hugepages 0
      nr_shmem_pmdmapped 0
      nr_anon_transparent_hugepages 0
      nr_unstable 0
      nr_vmscan_write 0
      nr_vmscan_immediate_reclaim 0
      nr_dirtied 3259
      nr_written 460
  pages free 53520
        min 63
        low 128
        high 193
   node_scanned 0
        spanned 65536
        present 65536
        managed 65218
      nr_free_pages 53520
      nr_zone_inactive_anon 418
      nr_zone_active_anon 2710
      nr_zone_inactive_file 4895
      nr_zone_active_file 1945
      nr_zone_unevictable 0
      nr_zone_write_pending 1
      nr_mlock 0
      nr_slab_reclaimable 438
      nr_slab_unreclaimable 808
      nr_page_table_pages 32
      nr_kernel_stack 2080
      nr_bounce 0
      numa_hit 313226
      numa_miss 0
      numa_foreign 0
      numa_interleave 3071
      numa_local 313226
      numa_other 0
      nr_free_cma 0
        protection: (0, 0, 0, 0)
  pagesets
    cpu: 0
              count: 2
              high: 6
              batch: 1
  vm stats threshold: 12
  node_unreclaimable: 0
  start_pfn: 0
  node_inactive_ratio: 0
Node 1, zone Movable
  per-node stats
      nr_inactive_anon 0
      nr_active_anon 0
      nr_inactive_file 0
      nr_active_file 0
      nr_unevictable 0
      nr_isolated_anon 0
      nr_isolated_file 0
      nr_pages_scanned 0
      workingset_refault 0
      workingset_activate 0
      workingset_nodereclaim 0
      nr_anon_pages 0
      nr_mapped 0
      nr_file_pages 0
      nr_dirty 0
      nr_writeback 0
      nr_writeback_temp 0
      nr_shmem 0
      nr_shmem_hugepages 0
      nr_shmem_pmdmapped 0
      nr_anon_transparent_hugepages 0
      nr_unstable 0
      nr_vmscan_write 0
      nr_vmscan_immediate_reclaim 0
      nr_dirtied 0
      nr_written 0
  pages free 65536
        min 63
        low 128
        high 193
   node_scanned 0
        spanned 65536
        present 65536
        managed 65536
      nr_free_pages 65536
      nr_zone_inactive_anon 0
      nr_zone_active_anon 0
      nr_zone_inactive_file 0
      nr_zone_active_file 0
      nr_zone_unevictable 0
      nr_zone_write_pending 0
      nr_mlock 0
      nr_slab_reclaimable 0
      nr_slab_unreclaimable 0
      nr_page_table_pages 0
      nr_kernel_stack 0
      nr_bounce 0
      numa_hit 0
      numa_miss 0
      numa_foreign 0
      numa_interleave 0
      numa_local 0
      numa_other 0
      nr_free_cma 0
        protection: (0, 0, 0, 0)
  pagesets
    cpu: 0
              count: 0
              high: 6
              batch: 1
  vm stats threshold: 14
  node_unreclaimable: 1
  start_pfn: 65536
  node_inactive_ratio: 0

After step 2, on v4.11-rc5:

$ grep . /sys/devices/system/memory/memory*/valid_zones
/sys/devices/system/memory/memory[0..254]/valid_zones:DMA
/sys/devices/system/memory/memory255/valid_zones:DMA Normal Movable
/sys/devices/system/memory/memory256/valid_zones:Movable Normal
/sys/devices/system/memory/memory[257..511]/valid_zones:Movable

After step 2, on v4.11-rc5 + all the patches from this thread:

$ grep . /sys/devices/system/memory/memory*/valid_zones
/sys/devices/system/memory/memory[0..255]/valid_zones:DMA
/sys/devices/system/memory/memory[256..511]/valid_zones:Movable

On v4.11-rc5, I can do steps 1-4 ad nauseam.
On v4.11-rc5 + all the patches from this thread, I can also repeat the
steps, but starting with the second iteration, the

/sys/devices/system/node/node1/memory*

symlinks are no longer created. I can still proceed by using the actual
files,

/sys/devices/system/memory/memory[256..511]

instead. I think it may be because step 4 does node_set_offline(1). That
is, the node is not only emptied of memory, it is offlined completely.
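
One hypothetical way to confirm that from userspace (the helper below is
mine, not part of the test above) is to watch the online node mask in
sysfs; if the suspicion is right, node 1 drops out of it after step 4:

/*
 * check_node_online.c -- print the online node mask.
 * Build: cc -o check_node_online check_node_online.c
 */
#include <stdio.h>

int main(void)
{
	char buf[64];
	FILE *f = fopen("/sys/devices/system/node/online", "r");

	if (!f) {
		perror("fopen");
		return 1;
	}
	if (fgets(buf, sizeof(buf), f))
		/* e.g. "0-1" while node 1 is online, just "0" after step 4 */
		printf("online nodes: %s", buf);
	fclose(f);
	return 0;
}

Running it before and after step 4 should print something like "online
nodes: 0-1" and then "online nodes: 0".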

I hope this made sense. :/

--
Reza Arbab