Re: [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory
From: Michal Hocko
Date: Thu Aug 29 2019 - 11:39:42 EST
On Mon 26-08-19 12:10:09, David Hildenbrand wrote:
> It is easier than I thought to trigger a kernel bug by removing memory that
> was never onlined. With CONFIG_DEBUG_VM the memmap is initialized with
> garbage, resulting in the detection of a broken zone when removing memory.
> Without CONFIG_DEBUG_VM it is less likely - but we could still have
> garbage in the memmap.
>
> :/# [ 23.912993] BUG: unable to handle page fault for address: 000000000000353d
> [ 23.914219] #PF: supervisor write access in kernel mode
> [ 23.915199] #PF: error_code(0x0002) - not-present page
> [ 23.916160] PGD 0 P4D 0
> [ 23.916627] Oops: 0002 [#1] SMP PTI
> [ 23.917256] CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
> [ 23.918900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
> [ 23.921194] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> [ 23.922249] RIP: 0010:clear_zone_contiguous+0x5/0x10
> [ 23.923173] Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
> [ 23.926876] RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
> [ 23.927928] RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
> [ 23.929458] RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
> [ 23.930899] RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
> [ 23.932362] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
> [ 23.933603] R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
> [ 23.934913] FS: 0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
> [ 23.936294] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 23.937481] CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
> [ 23.938687] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 23.939889] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 23.941168] Call Trace:
> [ 23.941580] __remove_pages+0x4b/0x640
> [ 23.942303] ? mark_held_locks+0x49/0x70
> [ 23.943149] arch_remove_memory+0x63/0x8d
> [ 23.943921] try_remove_memory+0xdb/0x130
> [ 23.944766] ? walk_memory_blocks+0x7f/0x9e
> [ 23.945616] __remove_memory+0xa/0x11
> [ 23.946274] acpi_memory_device_remove+0x70/0x100
> [ 23.947308] acpi_bus_trim+0x55/0x90
> [ 23.947914] acpi_device_hotplug+0x227/0x3a0
> [ 23.948714] acpi_hotplug_work_fn+0x1a/0x30
> [ 23.949433] process_one_work+0x221/0x550
> [ 23.950190] worker_thread+0x50/0x3b0
> [ 23.950993] kthread+0x105/0x140
> [ 23.951644] ? process_one_work+0x550/0x550
> [ 23.952508] ? kthread_park+0x80/0x80
> [ 23.953367] ret_from_fork+0x3a/0x50
> [ 23.954025] Modules linked in:
> [ 23.954613] CR2: 000000000000353d
> [ 23.955248] ---[ end trace 93d982b1fb3e1a69 ]---
Yes, this is indeed nasty. I didn't think of this when separating
memmap initialization from the hotremove. This means that the zone
pointer is already garbage in arch_remove_memory. The proper fix is to
remove it from that level down. Moreover, the zone is only needed for the
shrinking code and the zone contiguous handling. The latter belongs to the
offlining code unless I am missing something. I can see that you are
removing the zone parameter in a later patch, but wouldn't it be better to
remove the whole zone thing in a single patch and have this as a bug fix
for a rare bug with a Fixes tag?
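
To make the failure mode concrete, here is a toy user-space sketch. It is
not kernel code: every toy_* name, the bit layout and the poison value are
assumptions made up for illustration. It only models the idea that the zone
is derived from per-page metadata, so a memmap that was never initialized
by onlining hands back a bogus zone.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only: names, sizes and bit layout are hypothetical. */
enum { TOY_MAX_ZONES = 4, TOY_ZONE_SHIFT = 60 };

struct toy_page {
	unsigned long long flags;	/* zone index kept in the top 4 bits */
};

/* Roughly what onlining provides: the page records which zone it is in. */
static void toy_init_page(struct toy_page *page, unsigned int zone_idx)
{
	assert(zone_idx < TOY_MAX_ZONES);
	page->flags = (unsigned long long)zone_idx << TOY_ZONE_SHIFT;
}

/* A never-onlined page: its metadata still holds a poison pattern. */
static void toy_poison_page(struct toy_page *page)
{
	memset(page, 0xff, sizeof(*page));
}

/* Derive the zone index from the page metadata, valid or not. */
static unsigned int toy_page_zone_idx(const struct toy_page *page)
{
	return (unsigned int)(page->flags >> TOY_ZONE_SHIFT);
}

/* The zone lookup is only meaningful if the index is in range. */
static bool toy_zone_idx_valid(const struct toy_page *page)
{
	return toy_page_zone_idx(page) < TOY_MAX_ZONES;
}
```

In this model a page set up with toy_init_page() passes
toy_zone_idx_valid(), while a page left as toy_poison_page() wrote it does
not. Anything that blindly used the derived index to reach a zone would
touch garbage, which is why not passing a zone down the removal path at all
looks like the more robust direction.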
--
Michal Hocko
SUSE Labs