Re: printk() + memory offline deadlock (WAS Re: page_alloc.shuffle=1 + CONFIG_PROVE_LOCKING=y = arm64 hang)
From: Sergey Senozhatsky
Date: Mon Sep 23 2019 - 06:21:08 EST
On (09/18/19 12:10), Qian Cai wrote:
[..]
> > So you have debug objects enabled. Right? This thing does not behave
> > when it comes to printing. debug_objects are slightly problematic.
>
> Yes, but there is an also a similar splat without the debug_objects. It looks
> like anything try to allocate memory in that path will trigger it anyway.
Appears to be different, yet somehow very familiar.
> [ 297.425908] WARNING: possible circular locking dependency detected
> [ 297.425908] 5.3.0-next-20190917 #8 Not tainted
> [ 297.425909] ------------------------------------------------------
> [ 297.425910] test.sh/8653 is trying to acquire lock:
> [ 297.425911] ffffffff865a4460 (console_owner){-.-.}, at:
> console_unlock+0x207/0x750
>
> [ 297.425914] but task is already holding lock:
> [ 297.425915] ffff88883fff3c58 (&(&zone->lock)->rlock){-.-.}, at:
> __offline_isolated_pages+0x179/0x3e0
>
> [ 297.425919] which lock already depends on the new lock.
>
>
> [ 297.425920] the existing dependency chain (in reverse order) is:
>
> [ 297.425922] -> #3 (&(&zone->lock)->rlock){-.-.}:
> [ 297.425925] __lock_acquire+0x5b3/0xb40
> [ 297.425925] lock_acquire+0x126/0x280
> [ 297.425926] _raw_spin_lock+0x2f/0x40
> [ 297.425927] rmqueue_bulk.constprop.21+0xb6/0x1160
> [ 297.425928] get_page_from_freelist+0x898/0x22c0
> [ 297.425928] __alloc_pages_nodemask+0x2f3/0x1cd0
> [ 297.425929] alloc_pages_current+0x9c/0x110
> [ 297.425930] allocate_slab+0x4c6/0x19c0
> [ 297.425931] new_slab+0x46/0x70
> [ 297.425931] ___slab_alloc+0x58b/0x960
> [ 297.425932] __slab_alloc+0x43/0x70
> [ 297.425933] __kmalloc+0x3ad/0x4b0
> [ 297.425933] __tty_buffer_request_room+0x100/0x250
> [ 297.425934] tty_insert_flip_string_fixed_flag+0x67/0x110
> [ 297.425935] pty_write+0xa2/0xf0
> [ 297.425936] n_tty_write+0x36b/0x7b0
> [ 297.425936] tty_write+0x284/0x4c0
> [ 297.425937] __vfs_write+0x50/0xa0
> [ 297.425938] vfs_write+0x105/0x290
> [ 297.425939] redirected_tty_write+0x6a/0xc0
> [ 297.425939] do_iter_write+0x248/0x2a0
> [ 297.425940] vfs_writev+0x106/0x1e0
> [ 297.425941] do_writev+0xd4/0x180
> [ 297.425941] __x64_sys_writev+0x45/0x50
> [ 297.425942] do_syscall_64+0xcc/0x76c
> [ 297.425943] entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> [ 297.425944] -> #2 (&(&port->lock)->rlock){-.-.}:
> [ 297.425946] __lock_acquire+0x5b3/0xb40
> [ 297.425947] lock_acquire+0x126/0x280
> [ 297.425948] _raw_spin_lock_irqsave+0x3a/0x50
> [ 297.425949] tty_port_tty_get+0x20/0x60
> [ 297.425949] tty_port_default_wakeup+0xf/0x30
> [ 297.425950] tty_port_tty_wakeup+0x39/0x40
> [ 297.425951] uart_write_wakeup+0x2a/0x40
> [ 297.425952] serial8250_tx_chars+0x22e/0x440
> [ 297.425952] serial8250_handle_irq.part.8+0x14a/0x170
> [ 297.425953] serial8250_default_handle_irq+0x5c/0x90
> [ 297.425954] serial8250_interrupt+0xa6/0x130
> [ 297.425955] __handle_irq_event_percpu+0x78/0x4f0
> [ 297.425955] handle_irq_event_percpu+0x70/0x100
> [ 297.425956] handle_irq_event+0x5a/0x8b
> [ 297.425957] handle_edge_irq+0x117/0x370
> [ 297.425958] do_IRQ+0x9e/0x1e0
> [ 297.425958] ret_from_intr+0x0/0x2a
> [ 297.425959] cpuidle_enter_state+0x156/0x8e0
> [ 297.425960] cpuidle_enter+0x41/0x70
> [ 297.425960] call_cpuidle+0x5e/0x90
> [ 297.425961] do_idle+0x333/0x370
> [ 297.425962] cpu_startup_entry+0x1d/0x1f
> [ 297.425962] start_secondary+0x290/0x330
> [ 297.425963] secondary_startup_64+0xb6/0xc0
>
> [ 297.425964] -> #1 (&port_lock_key){-.-.}:
> [ 297.425967] __lock_acquire+0x5b3/0xb40
> [ 297.425967] lock_acquire+0x126/0x280
> [ 297.425968] _raw_spin_lock_irqsave+0x3a/0x50
> [ 297.425969] serial8250_console_write+0x3e4/0x450
> [ 297.425970] univ8250_console_write+0x4b/0x60
> [ 297.425970] console_unlock+0x501/0x750
> [ 297.425971] vprintk_emit+0x10d/0x340
> [ 297.425972] vprintk_default+0x1f/0x30
> [ 297.425972] vprintk_func+0x44/0xd4
> [ 297.425973] printk+0x9f/0xc5
> [ 297.425974] register_console+0x39c/0x520
> [ 297.425975] univ8250_console_init+0x23/0x2d
> [ 297.425975] console_init+0x338/0x4cd
> [ 297.425976] start_kernel+0x534/0x724
> [ 297.425977] x86_64_start_reservations+0x24/0x26
> [ 297.425977] x86_64_start_kernel+0xf4/0xfb
> [ 297.425978] secondary_startup_64+0xb6/0xc0
>
> [ 297.425979] -> #0 (console_owner){-.-.}:
> [ 297.425982] check_prev_add+0x107/0xea0
> [ 297.425982] validate_chain+0x8fc/0x1200
> [ 297.425983] __lock_acquire+0x5b3/0xb40
> [ 297.425984] lock_acquire+0x126/0x280
> [ 297.425984] console_unlock+0x269/0x750
> [ 297.425985] vprintk_emit+0x10d/0x340
> [ 297.425986] vprintk_default+0x1f/0x30
> [ 297.425987] vprintk_func+0x44/0xd4
> [ 297.425987] printk+0x9f/0xc5
> [ 297.425988] __offline_isolated_pages.cold.52+0x2f/0x30a
> [ 297.425989] offline_isolated_pages_cb+0x17/0x30
> [ 297.425990] walk_system_ram_range+0xda/0x160
> [ 297.425990] __offline_pages+0x79c/0xa10
> [ 297.425991] offline_pages+0x11/0x20
> [ 297.425992] memory_subsys_offline+0x7e/0xc0
> [ 297.425992] device_offline+0xd5/0x110
> [ 297.425993] state_store+0xc6/0xe0
> [ 297.425994] dev_attr_store+0x3f/0x60
> [ 297.425995] sysfs_kf_write+0x89/0xb0
> [ 297.425995] kernfs_fop_write+0x188/0x240
> [ 297.425996] __vfs_write+0x50/0xa0
> [ 297.425997] vfs_write+0x105/0x290
> [ 297.425997] ksys_write+0xc6/0x160
> [ 297.425998] __x64_sys_write+0x43/0x50
> [ 297.425999] do_syscall_64+0xcc/0x76c
> [ 297.426000] entry_SYSCALL_64_after_hwframe+0x49/0xbe
I suppose you run with CONFIG_DEBUG_VM...
So we have
port->lock -> MM -> zone->lock
// from pty_write()->__tty_buffer_request_room()->kmalloc()
vs
zone->lock -> printk() -> port->lock
// from __offline_pages()->__offline_isolated_pages()->printk()
A number of debugging options make the kernel less stable.
Sad but true.
-ss