Re: printk() + memory offline deadlock (WAS Re: page_alloc.shuffle=1 + CONFIG_PROVE_LOCKING=y = arm64 hang)

From: Qian Cai
Date: Wed Sep 18 2019 - 12:10:15 EST


On Thu, 2019-09-19 at 00:50 +0900, Sergey Senozhatsky wrote:
> On (09/18/19 10:39), Qian Cai wrote:
> > > Perhaps for a quick fix (and a comment that says this needs to be fixed
> > > properly). I think the changes to printk() that was discussed at
> > > Plumbers may also solve this properly.
> >
> > I assume that the new printk() stuff will also fix this deadlock between
> > printk() and memory offline.
>
> Mother chicken...
>
> Do you actually see a deadlock? I'd rather expect a lockdep splat, but
> anyway...

Not yet, just a lockdep splat so far.

>
> > [ÂÂ317.337595] WARNING: possible circular locking dependency detected
> > [ÂÂ317.337596] 5.3.0-next-20190917+ #9 Not tainted
> > [ÂÂ317.337597] ------------------------------------------------------
> > [ÂÂ317.337597] test.sh/8738 is trying to acquire lock:
> > [ÂÂ317.337598] ffffffffb33a4978 ((console_sem).lock){-.-.}, at:> down_trylock+0x16/0x50
> >
> > [ÂÂ317.337602] but task is already holding lock:
> > [ÂÂ317.337602] ffff88883fff4318 (&(&zone->lock)->rlock){-.-.}, at:> start_isolate_page_range+0x1f7/0x570
> >
> > [ÂÂ317.337606] which lock already depends on the new lock.
> >
> > [ÂÂ317.337608] the existing dependency chain (in reverse order) is:
> >
> > [ÂÂ317.337609] -> #3 (&(&zone->lock)->rlock){-.-.}:
> > [ÂÂ317.337612]ÂÂÂÂÂÂÂÂ__lock_acquire+0x5b3/0xb40
> > [ÂÂ317.337613]ÂÂÂÂÂÂÂÂlock_acquire+0x126/0x280
> > [ÂÂ317.337613]ÂÂÂÂÂÂÂÂ_raw_spin_lock+0x2f/0x40
> > [ÂÂ317.337614]ÂÂÂÂÂÂÂÂrmqueue_bulk.constprop.21+0xb6/0x1160
> > [ÂÂ317.337615]ÂÂÂÂÂÂÂÂget_page_from_freelist+0x898/0x22c0
> > [ÂÂ317.337616]ÂÂÂÂÂÂÂÂ__alloc_pages_nodemask+0x2f3/0x1cd0
> > [ÂÂ317.337617]ÂÂÂÂÂÂÂÂalloc_page_interleave+0x18/0x130
> > [ÂÂ317.337618]ÂÂÂÂÂÂÂÂalloc_pages_current+0xf6/0x110
> > [ÂÂ317.337619]ÂÂÂÂÂÂÂÂallocate_slab+0x4c6/0x19c0
> > [ÂÂ317.337620]ÂÂÂÂÂÂÂÂnew_slab+0x46/0x70
> > [ÂÂ317.337621]ÂÂÂÂÂÂÂÂ___slab_alloc+0x58b/0x960
> > [ÂÂ317.337621]ÂÂÂÂÂÂÂÂ__slab_alloc+0x43/0x70
> > [ÂÂ317.337622]ÂÂÂÂÂÂÂÂkmem_cache_alloc+0x354/0x460
> > [ÂÂ317.337623]ÂÂÂÂÂÂÂÂfill_pool+0x272/0x4b0
> > [ÂÂ317.337624]ÂÂÂÂÂÂÂÂ__debug_object_init+0x86/0x790
> > [ÂÂ317.337624]ÂÂÂÂÂÂÂÂdebug_object_init+0x16/0x20
> > [ÂÂ317.337625]ÂÂÂÂÂÂÂÂhrtimer_init+0x27/0x1e0
> > [ÂÂ317.337626]ÂÂÂÂÂÂÂÂinit_dl_task_timer+0x20/0x40
> > [ÂÂ317.337627]ÂÂÂÂÂÂÂÂ__sched_fork+0x10b/0x1f0
> > [ÂÂ317.337627]ÂÂÂÂÂÂÂÂinit_idle+0xac/0x520
> > [ÂÂ317.337628]ÂÂÂÂÂÂÂÂidle_thread_get+0x7c/0xc0
> > [ÂÂ317.337629]ÂÂÂÂÂÂÂÂbringup_cpu+0x1a/0x1e0
> > [ÂÂ317.337630]ÂÂÂÂÂÂÂÂcpuhp_invoke_callback+0x197/0x1120
> > [ÂÂ317.337630]ÂÂÂÂÂÂÂÂ_cpu_up+0x171/0x280
> > [ÂÂ317.337631]ÂÂÂÂÂÂÂÂdo_cpu_up+0xb1/0x120
> > [ÂÂ317.337632]ÂÂÂÂÂÂÂÂcpu_up+0x13/0x20
> > [ÂÂ317.337632]ÂÂÂÂÂÂÂÂsmp_init+0xa4/0x12d
> > [ÂÂ317.337633]ÂÂÂÂÂÂÂÂkernel_init_freeable+0x37e/0x76e
> > [ÂÂ317.337634]ÂÂÂÂÂÂÂÂkernel_init+0x11/0x12f
> > [ÂÂ317.337635]ÂÂÂÂÂÂÂÂret_from_fork+0x3a/0x50
>
> So you have debug objects enabled. Right? This thing does not behave
> when it comes to printing. debug_objects are slightly problematic.

Yes, but there is an also a similar splat without the debug_objects. It looks
like anything try to allocate memory in that path will trigger it anyway.

[ÂÂ297.425908] WARNING: possible circular locking dependency detected
[ÂÂ297.425908] 5.3.0-next-20190917 #8 Not tainted
[ÂÂ297.425909] ------------------------------------------------------
[ÂÂ297.425910] test.sh/8653 is trying to acquire lock:
[ÂÂ297.425911] ffffffff865a4460 (console_owner){-.-.}, at:
console_unlock+0x207/0x750

[ÂÂ297.425914] but task is already holding lock:
[ÂÂ297.425915] ffff88883fff3c58 (&(&zone->lock)->rlock){-.-.}, at:
__offline_isolated_pages+0x179/0x3e0

[ÂÂ297.425919] which lock already depends on the new lock.


[ÂÂ297.425920] the existing dependency chain (in reverse order) is:

[ÂÂ297.425922] -> #3 (&(&zone->lock)->rlock){-.-.}:
[ÂÂ297.425925]ÂÂÂÂÂÂÂÂ__lock_acquire+0x5b3/0xb40
[ÂÂ297.425925]ÂÂÂÂÂÂÂÂlock_acquire+0x126/0x280
[ÂÂ297.425926]ÂÂÂÂÂÂÂÂ_raw_spin_lock+0x2f/0x40
[ÂÂ297.425927]ÂÂÂÂÂÂÂÂrmqueue_bulk.constprop.21+0xb6/0x1160
[ÂÂ297.425928]ÂÂÂÂÂÂÂÂget_page_from_freelist+0x898/0x22c0
[ÂÂ297.425928]ÂÂÂÂÂÂÂÂ__alloc_pages_nodemask+0x2f3/0x1cd0
[ÂÂ297.425929]ÂÂÂÂÂÂÂÂalloc_pages_current+0x9c/0x110
[ÂÂ297.425930]ÂÂÂÂÂÂÂÂallocate_slab+0x4c6/0x19c0
[ÂÂ297.425931]ÂÂÂÂÂÂÂÂnew_slab+0x46/0x70
[ÂÂ297.425931]ÂÂÂÂÂÂÂÂ___slab_alloc+0x58b/0x960
[ÂÂ297.425932]ÂÂÂÂÂÂÂÂ__slab_alloc+0x43/0x70
[ÂÂ297.425933]ÂÂÂÂÂÂÂÂ__kmalloc+0x3ad/0x4b0
[ÂÂ297.425933]ÂÂÂÂÂÂÂÂ__tty_buffer_request_room+0x100/0x250
[ÂÂ297.425934]ÂÂÂÂÂÂÂÂtty_insert_flip_string_fixed_flag+0x67/0x110
[ÂÂ297.425935]ÂÂÂÂÂÂÂÂpty_write+0xa2/0xf0
[ÂÂ297.425936]ÂÂÂÂÂÂÂÂn_tty_write+0x36b/0x7b0
[ÂÂ297.425936]ÂÂÂÂÂÂÂÂtty_write+0x284/0x4c0
[ÂÂ297.425937]ÂÂÂÂÂÂÂÂ__vfs_write+0x50/0xa0
[ÂÂ297.425938]ÂÂÂÂÂÂÂÂvfs_write+0x105/0x290
[ÂÂ297.425939]ÂÂÂÂÂÂÂÂredirected_tty_write+0x6a/0xc0
[ÂÂ297.425939]ÂÂÂÂÂÂÂÂdo_iter_write+0x248/0x2a0
[ÂÂ297.425940]ÂÂÂÂÂÂÂÂvfs_writev+0x106/0x1e0
[ÂÂ297.425941]ÂÂÂÂÂÂÂÂdo_writev+0xd4/0x180
[ÂÂ297.425941]ÂÂÂÂÂÂÂÂ__x64_sys_writev+0x45/0x50
[ÂÂ297.425942]ÂÂÂÂÂÂÂÂdo_syscall_64+0xcc/0x76c
[ÂÂ297.425943]ÂÂÂÂÂÂÂÂentry_SYSCALL_64_after_hwframe+0x49/0xbe

[ÂÂ297.425944] -> #2 (&(&port->lock)->rlock){-.-.}:
[ÂÂ297.425946]ÂÂÂÂÂÂÂÂ__lock_acquire+0x5b3/0xb40
[ÂÂ297.425947]ÂÂÂÂÂÂÂÂlock_acquire+0x126/0x280
[ÂÂ297.425948]ÂÂÂÂÂÂÂÂ_raw_spin_lock_irqsave+0x3a/0x50
[ÂÂ297.425949]ÂÂÂÂÂÂÂÂtty_port_tty_get+0x20/0x60
[ÂÂ297.425949]ÂÂÂÂÂÂÂÂtty_port_default_wakeup+0xf/0x30
[ÂÂ297.425950]ÂÂÂÂÂÂÂÂtty_port_tty_wakeup+0x39/0x40
[ÂÂ297.425951]ÂÂÂÂÂÂÂÂuart_write_wakeup+0x2a/0x40
[ÂÂ297.425952]ÂÂÂÂÂÂÂÂserial8250_tx_chars+0x22e/0x440
[ÂÂ297.425952]ÂÂÂÂÂÂÂÂserial8250_handle_irq.part.8+0x14a/0x170
[ÂÂ297.425953]ÂÂÂÂÂÂÂÂserial8250_default_handle_irq+0x5c/0x90
[ÂÂ297.425954]ÂÂÂÂÂÂÂÂserial8250_interrupt+0xa6/0x130
[ÂÂ297.425955]ÂÂÂÂÂÂÂÂ__handle_irq_event_percpu+0x78/0x4f0
[ÂÂ297.425955]ÂÂÂÂÂÂÂÂhandle_irq_event_percpu+0x70/0x100
[ÂÂ297.425956]ÂÂÂÂÂÂÂÂhandle_irq_event+0x5a/0x8b
[ÂÂ297.425957]ÂÂÂÂÂÂÂÂhandle_edge_irq+0x117/0x370
[ÂÂ297.425958]ÂÂÂÂÂÂÂÂdo_IRQ+0x9e/0x1e0
[ÂÂ297.425958]ÂÂÂÂÂÂÂÂret_from_intr+0x0/0x2a
[ÂÂ297.425959]ÂÂÂÂÂÂÂÂcpuidle_enter_state+0x156/0x8e0
[ÂÂ297.425960]ÂÂÂÂÂÂÂÂcpuidle_enter+0x41/0x70
[ÂÂ297.425960]ÂÂÂÂÂÂÂÂcall_cpuidle+0x5e/0x90
[ÂÂ297.425961]ÂÂÂÂÂÂÂÂdo_idle+0x333/0x370
[ÂÂ297.425962]ÂÂÂÂÂÂÂÂcpu_startup_entry+0x1d/0x1f
[ÂÂ297.425962]ÂÂÂÂÂÂÂÂstart_secondary+0x290/0x330
[ÂÂ297.425963]ÂÂÂÂÂÂÂÂsecondary_startup_64+0xb6/0xc0

[ÂÂ297.425964] -> #1 (&port_lock_key){-.-.}:
[ÂÂ297.425967]ÂÂÂÂÂÂÂÂ__lock_acquire+0x5b3/0xb40
[ÂÂ297.425967]ÂÂÂÂÂÂÂÂlock_acquire+0x126/0x280
[ÂÂ297.425968]ÂÂÂÂÂÂÂÂ_raw_spin_lock_irqsave+0x3a/0x50
[ÂÂ297.425969]ÂÂÂÂÂÂÂÂserial8250_console_write+0x3e4/0x450
[ÂÂ297.425970]ÂÂÂÂÂÂÂÂuniv8250_console_write+0x4b/0x60
[ÂÂ297.425970]ÂÂÂÂÂÂÂÂconsole_unlock+0x501/0x750
[ÂÂ297.425971]ÂÂÂÂÂÂÂÂvprintk_emit+0x10d/0x340
[ÂÂ297.425972]ÂÂÂÂÂÂÂÂvprintk_default+0x1f/0x30
[ÂÂ297.425972]ÂÂÂÂÂÂÂÂvprintk_func+0x44/0xd4
[ÂÂ297.425973]ÂÂÂÂÂÂÂÂprintk+0x9f/0xc5
[ÂÂ297.425974]ÂÂÂÂÂÂÂÂregister_console+0x39c/0x520
[ÂÂ297.425975]ÂÂÂÂÂÂÂÂuniv8250_console_init+0x23/0x2d
[ÂÂ297.425975]ÂÂÂÂÂÂÂÂconsole_init+0x338/0x4cd
[ÂÂ297.425976]ÂÂÂÂÂÂÂÂstart_kernel+0x534/0x724
[ÂÂ297.425977]ÂÂÂÂÂÂÂÂx86_64_start_reservations+0x24/0x26
[ÂÂ297.425977]ÂÂÂÂÂÂÂÂx86_64_start_kernel+0xf4/0xfb
[ÂÂ297.425978]ÂÂÂÂÂÂÂÂsecondary_startup_64+0xb6/0xc0

[ÂÂ297.425979] -> #0 (console_owner){-.-.}:
[ÂÂ297.425982]ÂÂÂÂÂÂÂÂcheck_prev_add+0x107/0xea0
[ÂÂ297.425982]ÂÂÂÂÂÂÂÂvalidate_chain+0x8fc/0x1200
[ÂÂ297.425983]ÂÂÂÂÂÂÂÂ__lock_acquire+0x5b3/0xb40
[ÂÂ297.425984]ÂÂÂÂÂÂÂÂlock_acquire+0x126/0x280
[ÂÂ297.425984]ÂÂÂÂÂÂÂÂconsole_unlock+0x269/0x750
[ÂÂ297.425985]ÂÂÂÂÂÂÂÂvprintk_emit+0x10d/0x340
[ÂÂ297.425986]ÂÂÂÂÂÂÂÂvprintk_default+0x1f/0x30
[ÂÂ297.425987]ÂÂÂÂÂÂÂÂvprintk_func+0x44/0xd4
[ÂÂ297.425987]ÂÂÂÂÂÂÂÂprintk+0x9f/0xc5
[ÂÂ297.425988]ÂÂÂÂÂÂÂÂ__offline_isolated_pages.cold.52+0x2f/0x30a
[ÂÂ297.425989]ÂÂÂÂÂÂÂÂoffline_isolated_pages_cb+0x17/0x30
[ÂÂ297.425990]ÂÂÂÂÂÂÂÂwalk_system_ram_range+0xda/0x160
[ÂÂ297.425990]ÂÂÂÂÂÂÂÂ__offline_pages+0x79c/0xa10
[ÂÂ297.425991]ÂÂÂÂÂÂÂÂoffline_pages+0x11/0x20
[ÂÂ297.425992]ÂÂÂÂÂÂÂÂmemory_subsys_offline+0x7e/0xc0
[ÂÂ297.425992]ÂÂÂÂÂÂÂÂdevice_offline+0xd5/0x110
[ÂÂ297.425993]ÂÂÂÂÂÂÂÂstate_store+0xc6/0xe0
[ÂÂ297.425994]ÂÂÂÂÂÂÂÂdev_attr_store+0x3f/0x60
[ÂÂ297.425995]ÂÂÂÂÂÂÂÂsysfs_kf_write+0x89/0xb0
[ÂÂ297.425995]ÂÂÂÂÂÂÂÂkernfs_fop_write+0x188/0x240
[ÂÂ297.425996]ÂÂÂÂÂÂÂÂ__vfs_write+0x50/0xa0
[ÂÂ297.425997]ÂÂÂÂÂÂÂÂvfs_write+0x105/0x290
[ÂÂ297.425997]ÂÂÂÂÂÂÂÂksys_write+0xc6/0x160
[ÂÂ297.425998]ÂÂÂÂÂÂÂÂ__x64_sys_write+0x43/0x50
[ÂÂ297.425999]ÂÂÂÂÂÂÂÂdo_syscall_64+0xcc/0x76c
[ÂÂ297.426000]ÂÂÂÂÂÂÂÂentry_SYSCALL_64_after_hwframe+0x49/0xbe

[ÂÂ297.426001] other info that might help us debug this:

[ÂÂ297.426002] Chain exists of:
[ÂÂ297.426002]ÂÂÂconsole_owner --> &(&port->lock)->rlock --> &(&zone->lock)-
>rlock

[ÂÂ297.426007]ÂÂPossible unsafe locking scenario:

[ÂÂ297.426008]ÂÂÂÂÂÂÂÂCPU0ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂCPU1
[ÂÂ297.426009]ÂÂÂÂÂÂÂÂ----ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ----
[ÂÂ297.426009]ÂÂÂlock(&(&zone->lock)->rlock);
[ÂÂ297.426011]ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂlock(&(&port->lock)->rlock);
[ÂÂ297.426013]ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂlock(&(&zone->lock)->rlock);
[ÂÂ297.426014]ÂÂÂlock(console_owner);

[ÂÂ297.426016]ÂÂ*** DEADLOCK ***

[ÂÂ297.426017] 9 locks held by test.sh/8653:
[ÂÂ297.426018]ÂÂ#0: ffff88839ba7d408 (sb_writers#4){.+.+}, at:
vfs_write+0x25f/0x290
[ÂÂ297.426021]ÂÂ#1: ffff888277618880 (&of->mutex){+.+.}, at:
kernfs_fop_write+0x128/0x240
[ÂÂ297.426024]ÂÂ#2: ffff8898131fc218 (kn->count#115){.+.+}, at:
kernfs_fop_write+0x138/0x240
[ÂÂ297.426028]ÂÂ#3: ffffffff86962a80 (device_hotplug_lock){+.+.}, at:
lock_device_hotplug_sysfs+0x16/0x50
[ÂÂ297.426031]ÂÂ#4: ffff8884374f4990 (&dev->mutex){....}, at:
device_offline+0x70/0x110
[ÂÂ297.426034]ÂÂ#5: ffffffff86515250 (cpu_hotplug_lock.rw_sem){++++}, at:
__offline_pages+0xbf/0xa10
[ÂÂ297.426037]ÂÂ#6: ffffffff867405f0 (mem_hotplug_lock.rw_sem){++++}, at:
percpu_down_write+0x87/0x2f0
[ÂÂ297.426040]ÂÂ#7: ffff88883fff3c58 (&(&zone->lock)->rlock){-.-.}, at:
__offline_isolated_pages+0x179/0x3e0
[ÂÂ297.426043]ÂÂ#8: ffffffff865a4920 (console_lock){+.+.}, at:
vprintk_emit+0x100/0x340

[ÂÂ297.426047] stack backtrace:
[ÂÂ297.426048] CPU: 1 PID: 8653 Comm: test.sh Not tainted 5.3.0-next-20190917 #8
[ÂÂ297.426049] Hardware name: HPE ProLiant DL560 Gen10/ProLiant DL560 Gen10,
BIOS U34 05/21/2019
[ÂÂ297.426049] Call Trace:
[ÂÂ297.426050]ÂÂdump_stack+0x86/0xca
[ÂÂ297.426051]ÂÂprint_circular_bug.cold.31+0x243/0x26e
[ÂÂ297.426051]ÂÂcheck_noncircular+0x29e/0x2e0
[ÂÂ297.426052]ÂÂ? stack_trace_save+0x87/0xb0
[ÂÂ297.426053]ÂÂ? print_circular_bug+0x120/0x120
[ÂÂ297.426053]ÂÂcheck_prev_add+0x107/0xea0
[ÂÂ297.426054]ÂÂvalidate_chain+0x8fc/0x1200
[ÂÂ297.426055]ÂÂ? check_prev_add+0xea0/0xea0
[ÂÂ297.426055]ÂÂ__lock_acquire+0x5b3/0xb40
[ÂÂ297.426056]ÂÂlock_acquire+0x126/0x280
[ÂÂ297.426057]ÂÂ? console_unlock+0x207/0x750
[ÂÂ297.426057]ÂÂ? __kasan_check_read+0x11/0x20
[ÂÂ297.426058]ÂÂconsole_unlock+0x269/0x750
[ÂÂ297.426059]ÂÂ? console_unlock+0x207/0x750
[ÂÂ297.426059]ÂÂvprintk_emit+0x10d/0x340
[ÂÂ297.426060]ÂÂvprintk_default+0x1f/0x30
[ÂÂ297.426061]ÂÂvprintk_func+0x44/0xd4
[ÂÂ297.426061]ÂÂ? do_raw_spin_lock+0x118/0x1d0
[ÂÂ297.426062]ÂÂprintk+0x9f/0xc5
[ÂÂ297.426063]ÂÂ? kmsg_dump_rewind_nolock+0x64/0x64
[ÂÂ297.426064]ÂÂ? __offline_isolated_pages+0x179/0x3e0
[ÂÂ297.426064]ÂÂ__offline_isolated_pages.cold.52+0x2f/0x30a
[ÂÂ297.426065]ÂÂ? online_memory_block+0x20/0x20
[ÂÂ297.426066]ÂÂoffline_isolated_pages_cb+0x17/0x30
[ÂÂ297.426067]ÂÂwalk_system_ram_range+0xda/0x160
[ÂÂ297.426067]ÂÂ? walk_mem_res+0x30/0x30
[ÂÂ297.426068]ÂÂ? dissolve_free_huge_page+0x1e/0x2b0
[ÂÂ297.426069]ÂÂ__offline_pages+0x79c/0xa10
[ÂÂ297.426069]ÂÂ? __add_memory+0xc0/0xc0
[ÂÂ297.426070]ÂÂ? __kasan_check_write+0x14/0x20
[ÂÂ297.426071]ÂÂ? __mutex_lock+0x344/0xcd0
[ÂÂ297.426071]ÂÂ? _raw_spin_unlock_irqrestore+0x49/0x50
[ÂÂ297.426072]ÂÂ? device_offline+0x70/0x110
[ÂÂ297.426073]ÂÂ? klist_next+0x1c1/0x1e0
[ÂÂ297.426073]ÂÂ? __mutex_add_waiter+0xc0/0xc0
[ÂÂ297.426074]ÂÂ? klist_next+0x10b/0x1e0
[ÂÂ297.426075]ÂÂ? klist_iter_exit+0x16/0x40
[ÂÂ297.426076]ÂÂ? device_for_each_child+0xd0/0x110
[ÂÂ297.426076]ÂÂoffline_pages+0x11/0x20
[ÂÂ297.426077]ÂÂmemory_subsys_offline+0x7e/0xc0
[ÂÂ297.426078]ÂÂdevice_offline+0xd5/0x110
[ÂÂ297.426078]ÂÂ? auto_online_blocks_show+0x70/0x70
[ÂÂ297.426079]ÂÂstate_store+0xc6/0xe0
[ÂÂ297.426080]ÂÂdev_attr_store+0x3f/0x60
[ÂÂ297.426080]ÂÂ? device_match_name+0x40/0x40
[ÂÂ297.426081]ÂÂsysfs_kf_write+0x89/0xb0
[ÂÂ297.426082]ÂÂ? sysfs_file_ops+0xa0/0xa0
[ÂÂ297.426082]ÂÂkernfs_fop_write+0x188/0x240
[ÂÂ297.426083]ÂÂ__vfs_write+0x50/0xa0
[ÂÂ297.426084]ÂÂvfs_write+0x105/0x290
[ÂÂ297.426084]ÂÂksys_write+0xc6/0x160
[ÂÂ297.426085]ÂÂ? __x64_sys_read+0x50/0x50
[ÂÂ297.426086]ÂÂ? do_syscall_64+0x79/0x76c
[ÂÂ297.426086]ÂÂ? do_syscall_64+0x79/0x76c
[ÂÂ297.426087]ÂÂ__x64_sys_write+0x43/0x50
[ÂÂ297.426088]ÂÂdo_syscall_64+0xcc/0x76c
[ÂÂ297.426088]ÂÂ? trace_hardirqs_on_thunk+0x1a/0x20
[ÂÂ297.426089]ÂÂ? syscall_return_slowpath+0x210/0x210
[ÂÂ297.426090]ÂÂ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
[ÂÂ297.426091]ÂÂ? trace_hardirqs_off_caller+0x3a/0x150
[ÂÂ297.426092]ÂÂ? trace_hardirqs_off_thunk+0x1a/0x20
[ÂÂ297.426092]ÂÂentry_SYSCALL_64_after_hwframe+0x49/0xbe
[ÂÂ297.426093] RIP: 0033:0x7fd7336b4e18
[ÂÂ297.426095] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f
1e fa 48 8d 05 05 59 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0
ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[ÂÂ297.426096] RSP: 002b:00007ffc58c7b258 EFLAGS: 00000246 ORIG_RAX:
0000000000000001
[ÂÂ297.426098] RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007fd7336b4e18
[ÂÂ297.426098] RDX: 0000000000000008 RSI: 000055ad6d519c70 RDI: 0000000000000001
[ÂÂ297.426099] RBP: 000055ad6d519c70 R08: 000000000000000a R09: 00007fd733746300
[ÂÂ297.426100] R10: 000000000000000a R11: 0000000000000246 R12: 00007fd733986780
[ÂÂ297.426101] R13: 0000000000000008 R14: 00007fd733981740 R15: 0000000000000008


[ÂÂ763.659202][ÂÂÂÂC6] WARNING: possible circular locking dependency detected
[ÂÂ763.659202][ÂÂÂÂC6] 5.3.0-next-20190917 #3 Not tainted
[ÂÂ763.659203][ÂÂÂÂC6] ------------------------------------------------------
[ÂÂ763.659203][ÂÂÂÂC6] test.sh/8352 is trying to acquire lock:
[ÂÂ763.659203][ÂÂÂÂC6] ffffffffa187e5f8 ((console_sem).lock){..-.}, at:
down_trylock+0x14/0x40
[ÂÂ763.659206][ÂÂÂÂC6]Â
[ÂÂ763.659206][ÂÂÂÂC6] but task is already holding lock:
[ÂÂ763.659206][ÂÂÂÂC6] ffff9bcf7f373c58 (&(&zone->lock)->rlock){-.-.}, at:
__offline_isolated_pages+0x11e/0x2d0
[ÂÂ763.659208][ÂÂÂÂC6]Â
[ÂÂ763.659208][ÂÂÂÂC6] which lock already depends on the new lock.
[ÂÂ763.659209][ÂÂÂÂC6]Â
[ÂÂ763.659209][ÂÂÂÂC6]Â
[ÂÂ763.659209][ÂÂÂÂC6] the existing dependency chain (in reverse order) is:
[ÂÂ763.659210][ÂÂÂÂC6]Â
[ÂÂ763.659210][ÂÂÂÂC6] -> #3 (&(&zone->lock)->rlock){-.-.}:
[ÂÂ763.659211][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__lock_acquire+0x44e/0x8c0
[ÂÂ763.659212][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂlock_acquire+0xc0/0x1c0
[ÂÂ763.659212][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ_raw_spin_lock+0x2f/0x40
[ÂÂ763.659212][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂrmqueue_bulk.constprop.24+0x62/0xba0
[ÂÂ763.659213][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂget_page_from_freelist+0x581/0x1810
[ÂÂ763.659213][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__alloc_pages_nodemask+0x20d/0x1750
[ÂÂ763.659214][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂalloc_page_interleave+0x17/0x100
[ÂÂ763.659214][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂalloc_pages_current+0xc0/0xe0
[ÂÂ763.659214][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂallocate_slab+0x4b2/0x1a20
[ÂÂ763.659215][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂnew_slab+0x46/0x70
[ÂÂ763.659215][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ___slab_alloc+0x58a/0x960
[ÂÂ763.659215][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__slab_alloc+0x43/0x70
[ÂÂ763.659216][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂkmem_cache_alloc+0x33e/0x440
[ÂÂ763.659216][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂfill_pool+0x1ae/0x460
[ÂÂ763.659216][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__debug_object_init+0x35/0x4a0
[ÂÂ763.659217][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂdebug_object_init+0x16/0x20
[ÂÂ763.659217][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂhrtimer_init+0x25/0x130
[ÂÂ763.659218][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂinit_dl_task_timer+0x20/0x30
[ÂÂ763.659218][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__sched_fork+0x92/0x100
[ÂÂ763.659218][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂinit_idle+0x8d/0x380
[ÂÂ763.659219][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂfork_idle+0xd9/0x140
[ÂÂ763.659219][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂidle_threads_init+0xd3/0x15e
[ÂÂ763.659219][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂsmp_init+0x1b/0xbb
[ÂÂ763.659220][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂkernel_init_freeable+0x248/0x557
[ÂÂ763.659220][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂkernel_init+0xf/0x11e
[ÂÂ763.659220][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂret_from_fork+0x27/0x50
[ÂÂ763.659221][ÂÂÂÂC6]Â
[ÂÂ763.659221][ÂÂÂÂC6] -> #2 (&rq->lock){-.-.}:
[ÂÂ763.659222][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__lock_acquire+0x44e/0x8c0
[ÂÂ763.659223][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂlock_acquire+0xc0/0x1c0
[ÂÂ763.659223][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ_raw_spin_lock+0x2f/0x40
[ÂÂ763.659223][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂtask_fork_fair+0x37/0x150
[ÂÂ763.659224][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂsched_fork+0x126/0x230
[ÂÂ763.659224][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂcopy_process+0xafc/0x1e90
[ÂÂ763.659224][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ_do_fork+0x89/0x720
[ÂÂ763.659225][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂkernel_thread+0x58/0x70
[ÂÂ763.659225][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂrest_init+0x28/0x302
[ÂÂ763.659225][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂarch_call_rest_init+0xe/0x1b
[ÂÂ763.659226][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂstart_kernel+0x581/0x5a0
[ÂÂ763.659226][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂx86_64_start_reservations+0x24/0x26
[ÂÂ763.659227][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂx86_64_start_kernel+0xef/0xf6
[ÂÂ763.659227][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂsecondary_startup_64+0xb6/0xc0
[ÂÂ763.659227][ÂÂÂÂC6]Â
[ÂÂ763.659227][ÂÂÂÂC6] -> #1 (&p->pi_lock){-.-.}:
[ÂÂ763.659229][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__lock_acquire+0x44e/0x8c0
[ÂÂ763.659229][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂlock_acquire+0xc0/0x1c0
[ÂÂ763.659230][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ_raw_spin_lock_irqsave+0x3a/0x50
[ÂÂ763.659230][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂtry_to_wake_up+0x5c/0xbc0
[ÂÂ763.659230][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂwake_up_process+0x15/0x20
[ÂÂ763.659231][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__up+0x4a/0x50
[ÂÂ763.659231][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂup+0x45/0x50
[ÂÂ763.659231][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__up_console_sem+0x37/0x60
[ÂÂ763.659232][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂconsole_unlock+0x357/0x600
[ÂÂ763.659232][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂvprintk_emit+0x101/0x320
[ÂÂ763.659232][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂvprintk_default+0x1f/0x30
[ÂÂ763.659233][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂvprintk_func+0x44/0xd4
[ÂÂ763.659233][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂprintk+0x58/0x6f
[ÂÂ763.659234][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂdo_exit+0xd73/0xd80
[ÂÂ763.659234][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂdo_group_exit+0x41/0xd0
[ÂÂ763.659234][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__x64_sys_exit_group+0x18/0x20
[ÂÂ763.659235][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂdo_syscall_64+0x6d/0x488
[ÂÂ763.659235][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂentry_SYSCALL_64_after_hwframe+0x49/0xbe
[ÂÂ763.659235][ÂÂÂÂC6]Â
[ÂÂ763.659236][ÂÂÂÂC6] -> #0 ((console_sem).lock){..-.}:
[ÂÂ763.659237][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂcheck_prev_add+0x9b/0xa10
[ÂÂ763.659237][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂvalidate_chain+0x759/0xdc0
[ÂÂ763.659238][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__lock_acquire+0x44e/0x8c0
[ÂÂ763.659238][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂlock_acquire+0xc0/0x1c0
[ÂÂ763.659239][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ_raw_spin_lock_irqsave+0x3a/0x50
[ÂÂ763.659239][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂdown_trylock+0x14/0x40
[ÂÂ763.659239][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__down_trylock_console_sem+0x2b/0xa0
[ÂÂ763.659240][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂconsole_trylock+0x16/0x60
[ÂÂ763.659240][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂvprintk_emit+0xf4/0x320
[ÂÂ763.659240][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂvprintk_default+0x1f/0x30
[ÂÂ763.659241][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂvprintk_func+0x44/0xd4
[ÂÂ763.659241][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂprintk+0x58/0x6f
[ÂÂ763.659242][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__offline_isolated_pages.cold.55+0x38/0x28e
[ÂÂ763.659242][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂoffline_isolated_pages_cb+0x15/0x20
[ÂÂ763.659242][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂwalk_system_ram_range+0x7b/0xd0
[ÂÂ763.659243][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__offline_pages+0x456/0xc10
[ÂÂ763.659243][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂoffline_pages+0x11/0x20
[ÂÂ763.659243][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂmemory_subsys_offline+0x44/0x60
[ÂÂ763.659244][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂdevice_offline+0x90/0xc0
[ÂÂ763.659244][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂstate_store+0xbc/0xe0
[ÂÂ763.659244][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂdev_attr_store+0x17/0x30
[ÂÂ763.659245][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂsysfs_kf_write+0x4b/0x60
[ÂÂ763.659245][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂkernfs_fop_write+0x119/0x1c0
[ÂÂ763.659245][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__vfs_write+0x1b/0x40
[ÂÂ763.659246][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂvfs_write+0xbd/0x1c0
[ÂÂ763.659246][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂksys_write+0x64/0xe0
[ÂÂ763.659247][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ__x64_sys_write+0x1a/0x20
[ÂÂ763.659247][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂdo_syscall_64+0x6d/0x488
[ÂÂ763.659247][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂentry_SYSCALL_64_after_hwframe+0x49/0xbe
[ÂÂ763.659248][ÂÂÂÂC6]Â
[ÂÂ763.659248][ÂÂÂÂC6] other info that might help us debug this:
[ÂÂ763.659248][ÂÂÂÂC6]Â
[ÂÂ763.659248][ÂÂÂÂC6] Chain exists of:
[ÂÂ763.659249][ÂÂÂÂC6]ÂÂÂ(console_sem).lock --> &rq->lock --> &(&zone->lock)-
>rlock
[ÂÂ763.659251][ÂÂÂÂC6]Â
[ÂÂ763.659251][ÂÂÂÂC6]ÂÂPossible unsafe locking scenario:
[ÂÂ763.659251][ÂÂÂÂC6]Â
[ÂÂ763.659252][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂCPU0ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂCPU1
[ÂÂ763.659252][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂ----ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ----
[ÂÂ763.659252][ÂÂÂÂC6]ÂÂÂlock(&(&zone->lock)->rlock);
[ÂÂ763.659253][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂlock(&rq->lock);
[ÂÂ763.659254][ÂÂÂÂC6]ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂlock(&(&zone->lock)-
>rlock);
[ÂÂ763.659255][ÂÂÂÂC6]ÂÂÂlock((console_sem).lock);
[ÂÂ763.659256][ÂÂÂÂC6]Â
[ÂÂ763.659256][ÂÂÂÂC6]ÂÂ*** DEADLOCK ***
[ÂÂ763.659256][ÂÂÂÂC6]Â
[ÂÂ763.659257][ÂÂÂÂC6] 8 locks held by test.sh/8352:
[ÂÂ763.659257][ÂÂÂÂC6]ÂÂ#0: ffff9bdf4da39408 (sb_writers#4){.+.+}, at:
vfs_write+0x174/0x1c0
[ÂÂ763.659259][ÂÂÂÂC6]ÂÂ#1: ffff9be348280880 (&of->mutex){+.+.}, at:
kernfs_fop_write+0xe4/0x1c0
[ÂÂ763.659260][ÂÂÂÂC6]ÂÂ#2: ffff9bdb873757d0 (kn->count#125){.+.+}, at:
kernfs_fop_write+0xed/0x1c0
[ÂÂ763.659262][ÂÂÂÂC6]ÂÂ#3: ffffffffa194dec0 (device_hotplug_lock){+.+.}, at:
lock_device_hotplug_sysfs+0x15/0x40
[ÂÂ763.659264][ÂÂÂÂC6]ÂÂ#4: ffff9bcf7314c990 (&dev->mutex){....}, at:
device_offline+0x4e/0xc0
[ÂÂ763.659265][ÂÂÂÂC6]ÂÂ#5: ffffffffa185b9f0 (cpu_hotplug_lock.rw_sem){++++},
at: __offline_pages+0x3b/0xc10
[ÂÂ763.659267][ÂÂÂÂC6]ÂÂ#6: ffffffffa18e0b90 (mem_hotplug_lock.rw_sem){++++},
at: percpu_down_write+0x36/0x1c0
[ÂÂ763.659268][ÂÂÂÂC6]ÂÂ#7: ffff9bcf7f373c58 (&(&zone->lock)->rlock){-.-.}, at:
__offline_isolated_pages+0x11e/0x2d0
[ÂÂ763.659270][ÂÂÂÂC6]Â
[ÂÂ763.659270][ÂÂÂÂC6] stack backtrace:
[ÂÂ763.659271][ÂÂÂÂC6] CPU: 6 PID: 8352 Comm: test.sh Not tainted 5.3.0-next-
20190917 #3
[ÂÂ763.659271][ÂÂÂÂC6] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385
Gen10, BIOS A40 07/10/2019
[ÂÂ763.659272][ÂÂÂÂC6] Call Trace:
[ÂÂ763.659272][ÂÂÂÂC6]ÂÂdump_stack+0x70/0x9a
[ÂÂ763.659272][ÂÂÂÂC6]ÂÂprint_circular_bug.cold.31+0x1c0/0x1eb
[ÂÂ763.659273][ÂÂÂÂC6]ÂÂcheck_noncircular+0x18c/0x1a0
[ÂÂ763.659273][ÂÂÂÂC6]ÂÂcheck_prev_add+0x9b/0xa10
[ÂÂ763.659273][ÂÂÂÂC6]ÂÂvalidate_chain+0x759/0xdc0
[ÂÂ763.659274][ÂÂÂÂC6]ÂÂ__lock_acquire+0x44e/0x8c0
[ÂÂ763.659274][ÂÂÂÂC6]ÂÂlock_acquire+0xc0/0x1c0
[ÂÂ763.659274][ÂÂÂÂC6]ÂÂ? down_trylock+0x14/0x40
[ÂÂ763.659275][ÂÂÂÂC6]ÂÂ? vprintk_emit+0xf4/0x320
[ÂÂ763.659275][ÂÂÂÂC6]ÂÂ_raw_spin_lock_irqsave+0x3a/0x50
[ÂÂ763.659275][ÂÂÂÂC6]ÂÂ? down_trylock+0x14/0x40
[ÂÂ763.659276][ÂÂÂÂC6]ÂÂdown_trylock+0x14/0x40
[ÂÂ763.659276][ÂÂÂÂC6]ÂÂ__down_trylock_console_sem+0x2b/0xa0
[ÂÂ763.659276][ÂÂÂÂC6]ÂÂconsole_trylock+0x16/0x60
[ÂÂ763.659277][ÂÂÂÂC6]ÂÂvprintk_emit+0xf4/0x320
[ÂÂ763.659277][ÂÂÂÂC6]ÂÂvprintk_default+0x1f/0x30
[ÂÂ763.659277][ÂÂÂÂC6]ÂÂvprintk_func+0x44/0xd4
[ÂÂ763.659278][ÂÂÂÂC6]ÂÂprintk+0x58/0x6f
[ÂÂ763.659278][ÂÂÂÂC6]ÂÂ__offline_isolated_pages.cold.55+0x38/0x28e
[ÂÂ763.659278][ÂÂÂÂC6]ÂÂ? online_memory_block+0x20/0x20
[ÂÂ763.659279][ÂÂÂÂC6]ÂÂoffline_isolated_pages_cb+0x15/0x20
[ÂÂ763.659279][ÂÂÂÂC6]ÂÂwalk_system_ram_range+0x7b/0xd0
[ÂÂ763.659279][ÂÂÂÂC6]ÂÂ__offline_pages+0x456/0xc10
[ÂÂ763.659280][ÂÂÂÂC6]ÂÂoffline_pages+0x11/0x20
[ÂÂ763.659280][ÂÂÂÂC6]ÂÂmemory_subsys_offline+0x44/0x60
[ÂÂ763.659280][ÂÂÂÂC6]ÂÂdevice_offline+0x90/0xc0
[ÂÂ763.659281][ÂÂÂÂC6]ÂÂstate_store+0xbc/0xe0
[ÂÂ763.659281][ÂÂÂÂC6]ÂÂdev_attr_store+0x17/0x30
[ÂÂ763.659281][ÂÂÂÂC6]ÂÂsysfs_kf_write+0x4b/0x60
[ÂÂ763.659282][ÂÂÂÂC6]ÂÂkernfs_fop_write+0x119/0x1c0
[ÂÂ763.659282][ÂÂÂÂC6]ÂÂ__vfs_write+0x1b/0x40
[ÂÂ763.659282][ÂÂÂÂC6]ÂÂvfs_write+0xbd/0x1c0
[ÂÂ763.659283][ÂÂÂÂC6]ÂÂksys_write+0x64/0xe0
[ÂÂ763.659283][ÂÂÂÂC6]ÂÂ__x64_sys_write+0x1a/0x20
[ÂÂ763.659283][ÂÂÂÂC6]ÂÂdo_syscall_64+0x6d/0x488
[ÂÂ763.659284][ÂÂÂÂC6]ÂÂ? trace_hardirqs_off_thunk+0x1a/0x20
[ÂÂ763.659284][ÂÂÂÂC6]ÂÂentry_SYSCALL_64_after_hwframe+0x49/0xbe

>
> This thing does
>
> rq->lock --> zone->lock
>
> It takes rq->lock and then calls into __sched_fork()->hrtimer_init()->debug_objects()->MM
>
> This doesn't look very right - a dive into MM under rq->lock.
>
> Peter, Thomas am I wrong?
>
> > [ÂÂ317.337635] -> #2 (&rq->lock){-.-.}:
> > [ÂÂ317.337638]ÂÂÂÂÂÂÂÂ__lock_acquire+0x5b3/0xb40
> > [ÂÂ317.337639]ÂÂÂÂÂÂÂÂlock_acquire+0x126/0x280
> > [ÂÂ317.337639]ÂÂÂÂÂÂÂÂ_raw_spin_lock+0x2f/0x40
> > [ÂÂ317.337640]ÂÂÂÂÂÂÂÂtask_fork_fair+0x43/0x200
> > [ÂÂ317.337641]ÂÂÂÂÂÂÂÂsched_fork+0x29b/0x420
> > [ÂÂ317.337642]ÂÂÂÂÂÂÂÂcopy_process+0xf3c/0x2fd0
> > [ÂÂ317.337642]ÂÂÂÂÂÂÂÂ_do_fork+0xef/0x950
> > [ÂÂ317.337643]ÂÂÂÂÂÂÂÂkernel_thread+0xa8/0xe0
> > [ÂÂ317.337644]ÂÂÂÂÂÂÂÂrest_init+0x28/0x311
> > [ÂÂ317.337645]ÂÂÂÂÂÂÂÂarch_call_rest_init+0xe/0x1b
> > [ÂÂ317.337645]ÂÂÂÂÂÂÂÂstart_kernel+0x6eb/0x724
> > [ÂÂ317.337646]ÂÂÂÂÂÂÂÂx86_64_start_reservations+0x24/0x26
> > [ÂÂ317.337647]ÂÂÂÂÂÂÂÂx86_64_start_kernel+0xf4/0xfb
> > [ÂÂ317.337648]ÂÂÂÂÂÂÂÂsecondary_startup_64+0xb6/0xc0
>
> pi_lock --> rq->lock
>
> > [ÂÂ317.337649] -> #1 (&p->pi_lock){-.-.}:
> > [ÂÂ317.337651]ÂÂÂÂÂÂÂÂ__lock_acquire+0x5b3/0xb40
> > [ÂÂ317.337652]ÂÂÂÂÂÂÂÂlock_acquire+0x126/0x280
> > [ÂÂ317.337653]ÂÂÂÂÂÂÂÂ_raw_spin_lock_irqsave+0x3a/0x50
> > [ÂÂ317.337653]ÂÂÂÂÂÂÂÂtry_to_wake_up+0xb4/0x1030
> > [ÂÂ317.337654]ÂÂÂÂÂÂÂÂwake_up_process+0x15/0x20
> > [ÂÂ317.337655]ÂÂÂÂÂÂÂÂ__up+0xaa/0xc0
> > [ÂÂ317.337655]ÂÂÂÂÂÂÂÂup+0x55/0x60
> > [ÂÂ317.337656]ÂÂÂÂÂÂÂÂ__up_console_sem+0x37/0x60
> > [ÂÂ317.337657]ÂÂÂÂÂÂÂÂconsole_unlock+0x3a0/0x750
> > [ÂÂ317.337658]ÂÂÂÂÂÂÂÂvprintk_emit+0x10d/0x340
> > [ÂÂ317.337658]ÂÂÂÂÂÂÂÂvprintk_default+0x1f/0x30
> > [ÂÂ317.337659]ÂÂÂÂÂÂÂÂvprintk_func+0x44/0xd4
> > [ÂÂ317.337660]ÂÂÂÂÂÂÂÂprintk+0x9f/0xc5
> > [ÂÂ317.337660]ÂÂÂÂÂÂÂÂcrng_reseed+0x3cc/0x440
> > [ÂÂ317.337661]ÂÂÂÂÂÂÂÂcredit_entropy_bits+0x3e8/0x4f0
> > [ÂÂ317.337662]ÂÂÂÂÂÂÂÂrandom_ioctl+0x1eb/0x250
> > [ÂÂ317.337663]ÂÂÂÂÂÂÂÂdo_vfs_ioctl+0x13e/0xa70
> > [ÂÂ317.337663]ÂÂÂÂÂÂÂÂksys_ioctl+0x41/0x80
> > [ÂÂ317.337664]ÂÂÂÂÂÂÂÂ__x64_sys_ioctl+0x43/0x4c
> > [ÂÂ317.337665]ÂÂÂÂÂÂÂÂdo_syscall_64+0xcc/0x76c
> > [ÂÂ317.337666]ÂÂÂÂÂÂÂÂentry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> console_sem->lock --> pi_lock
>
> This also covers console_sem->lock --> rq->lock, and maintains
> pi_lock --> rq->lock
>
> So we have
>
> console_sem->lock --> pi_lock --> rq->lock
>
> > [ÂÂ317.337667] -> #0 ((console_sem).lock){-.-.}:
> > [ÂÂ317.337669]ÂÂÂÂÂÂÂÂcheck_prev_add+0x107/0xea0
> > [ÂÂ317.337670]ÂÂÂÂÂÂÂÂvalidate_chain+0x8fc/0x1200
> > [ÂÂ317.337671]ÂÂÂÂÂÂÂÂ__lock_acquire+0x5b3/0xb40
> > [ÂÂ317.337671]ÂÂÂÂÂÂÂÂlock_acquire+0x126/0x280
> > [ÂÂ317.337672]ÂÂÂÂÂÂÂÂ_raw_spin_lock_irqsave+0x3a/0x50
> > [ÂÂ317.337673]ÂÂÂÂÂÂÂÂdown_trylock+0x16/0x50
> > [ÂÂ317.337674]ÂÂÂÂÂÂÂÂ__down_trylock_console_sem+0x2b/0xa0
> > [ÂÂ317.337675]ÂÂÂÂÂÂÂÂconsole_trylock+0x16/0x60
> > [ÂÂ317.337676]ÂÂÂÂÂÂÂÂvprintk_emit+0x100/0x340
> > [ÂÂ317.337677]ÂÂÂÂÂÂÂÂvprintk_default+0x1f/0x30
> > [ÂÂ317.337678]ÂÂÂÂÂÂÂÂvprintk_func+0x44/0xd4
> > [ÂÂ317.337678]ÂÂÂÂÂÂÂÂprintk+0x9f/0xc5
> > [ÂÂ317.337679]ÂÂÂÂÂÂÂÂ__dump_page.cold.2+0x73/0x210
> > [ÂÂ317.337680]ÂÂÂÂÂÂÂÂdump_page+0x12/0x50
> > [ÂÂ317.337680]ÂÂÂÂÂÂÂÂhas_unmovable_pages+0x3e9/0x4b0
> > [ÂÂ317.337681]ÂÂÂÂÂÂÂÂstart_isolate_page_range+0x3b4/0x570
> > [ÂÂ317.337682]ÂÂÂÂÂÂÂÂ__offline_pages+0x1ad/0xa10
> > [ÂÂ317.337683]ÂÂÂÂÂÂÂÂoffline_pages+0x11/0x20
> > [ÂÂ317.337683]ÂÂÂÂÂÂÂÂmemory_subsys_offline+0x7e/0xc0
> > [ÂÂ317.337684]ÂÂÂÂÂÂÂÂdevice_offline+0xd5/0x110
> > [ÂÂ317.337685]ÂÂÂÂÂÂÂÂstate_store+0xc6/0xe0
> > [ÂÂ317.337686]ÂÂÂÂÂÂÂÂdev_attr_store+0x3f/0x60
> > [ÂÂ317.337686]ÂÂÂÂÂÂÂÂsysfs_kf_write+0x89/0xb0
> > [ÂÂ317.337687]ÂÂÂÂÂÂÂÂkernfs_fop_write+0x188/0x240
> > [ÂÂ317.337688]ÂÂÂÂÂÂÂÂ__vfs_write+0x50/0xa0
> > [ÂÂ317.337688]ÂÂÂÂÂÂÂÂvfs_write+0x105/0x290
> > [ÂÂ317.337689]ÂÂÂÂÂÂÂÂksys_write+0xc6/0x160
> > [ÂÂ317.337690]ÂÂÂÂÂÂÂÂ__x64_sys_write+0x43/0x50
> > [ÂÂ317.337691]ÂÂÂÂÂÂÂÂdo_syscall_64+0xcc/0x76c
> > [ÂÂ317.337691]ÂÂÂÂÂÂÂÂentry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> zone->lock --> console_sem->lock
>
> So then we have
>
> zone->lock --> console_sem->lock --> pi_lock --> rq->lock
>
> vs. the reverse chain
>
> rq->lock --> console_sem->lock
>
> If I get this right.
>
> > [ÂÂ317.337693] other info that might help us debug this:
> >
> > [ÂÂ317.337694] Chain exists of:
> > [ÂÂ317.337694]ÂÂÂ(console_sem).lock --> &rq->lock --> &(&zone->lock)->rlock
> >
> > [ÂÂ317.337699]ÂÂPossible unsafe locking scenario:
> >
> > [ÂÂ317.337700]ÂÂÂÂÂÂÂÂCPU0ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂCPU1
> > [ÂÂ317.337701]ÂÂÂÂÂÂÂÂ----ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ----
> > [ÂÂ317.337701]ÂÂÂlock(&(&zone->lock)->rlock);
> > [ÂÂ317.337703]ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂlock(&rq->lock);
> > [ÂÂ317.337705]ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂlock(&(&zone->lock)->rlock);
> > [ÂÂ317.337706]ÂÂÂlock((console_sem).lock);
> >
> > [ÂÂ317.337708]ÂÂ*** DEADLOCK ***
> >
> > [ÂÂ317.337710] 8 locks held by test.sh/8738:
> > [ÂÂ317.337710]ÂÂ#0: ffff8883940b5408 (sb_writers#4){.+.+}, at: vfs_write+0x25f/0x290
> > [ÂÂ317.337713]ÂÂ#1: ffff889fce310280 (&of->mutex){+.+.}, at: kernfs_fop_write+0x128/0x240
> > [ÂÂ317.337716]ÂÂ#2: ffff889feb6d4830 (kn->count#115){.+.+}, at: kernfs_fop_write+0x138/0x240
> > [ÂÂ317.337720]ÂÂ#3: ffffffffb3762d40 (device_hotplug_lock){+.+.}, at: lock_device_hotplug_sysfs+0x16/0x50
> > [ÂÂ317.337723]ÂÂ#4: ffff88981f0dc990 (&dev->mutex){....}, at: device_offline+0x70/0x110
> > [ÂÂ317.337726]ÂÂ#5: ffffffffb3315250 (cpu_hotplug_lock.rw_sem){++++}, at: __offline_pages+0xbf/0xa10
> > [ÂÂ317.337729]ÂÂ#6: ffffffffb35408b0 (mem_hotplug_lock.rw_sem){++++}, at: percpu_down_write+0x87/0x2f0
> > [ÂÂ317.337732]ÂÂ#7: ffff88883fff4318 (&(&zone->lock)->rlock){-.-.}, at: start_isolate_page_range+0x1f7/0x570
> > [ÂÂ317.337736] stack backtrace:
> > [ÂÂ317.337737] CPU: 58 PID: 8738 Comm: test.sh Not tainted 5.3.0-next-20190917+ #9
> > [ÂÂ317.337738] Hardware name: HPE ProLiant DL560 Gen10/ProLiant DL560 Gen10, BIOS U34 05/21/2019
> > [ÂÂ317.337739] Call Trace:
> > [ÂÂ317.337739]ÂÂdump_stack+0x86/0xca
> > [ÂÂ317.337740]ÂÂprint_circular_bug.cold.31+0x243/0x26e
> > [ÂÂ317.337741]ÂÂcheck_noncircular+0x29e/0x2e0
> > [ÂÂ317.337742]ÂÂ? debug_lockdep_rcu_enabled+0x4b/0x60
> > [ÂÂ317.337742]ÂÂ? print_circular_bug+0x120/0x120
> > [ÂÂ317.337743]ÂÂ? is_ftrace_trampoline+0x9/0x20
> > [ÂÂ317.337744]ÂÂ? kernel_text_address+0x59/0xc0
> > [ÂÂ317.337744]ÂÂ? __kernel_text_address+0x12/0x40
> > [ÂÂ317.337745]ÂÂcheck_prev_add+0x107/0xea0
> > [ÂÂ317.337746]ÂÂvalidate_chain+0x8fc/0x1200
> > [ÂÂ317.337746]ÂÂ? check_prev_add+0xea0/0xea0
> > [ÂÂ317.337747]ÂÂ? format_decode+0xd6/0x600
> > [ÂÂ317.337748]ÂÂ? file_dentry_name+0xe0/0xe0
> > [ÂÂ317.337749]ÂÂ__lock_acquire+0x5b3/0xb40
> > [ÂÂ317.337749]ÂÂlock_acquire+0x126/0x280
> > [ÂÂ317.337750]ÂÂ? down_trylock+0x16/0x50
> > [ÂÂ317.337751]ÂÂ? vprintk_emit+0x100/0x340
> > [ÂÂ317.337752]ÂÂ_raw_spin_lock_irqsave+0x3a/0x50
> > [ÂÂ317.337753]ÂÂ? down_trylock+0x16/0x50
> > [ÂÂ317.337753]ÂÂdown_trylock+0x16/0x50
> > [ÂÂ317.337754]ÂÂ? vprintk_emit+0x100/0x340
> > [ÂÂ317.337755]ÂÂ__down_trylock_console_sem+0x2b/0xa0
> > [ÂÂ317.337756]ÂÂconsole_trylock+0x16/0x60
> > [ÂÂ317.337756]ÂÂvprintk_emit+0x100/0x340
> > [ÂÂ317.337757]ÂÂvprintk_default+0x1f/0x30
> > [ÂÂ317.337758]ÂÂvprintk_func+0x44/0xd4
> > [ÂÂ317.337758]ÂÂprintk+0x9f/0xc5
> > [ÂÂ317.337759]ÂÂ? kmsg_dump_rewind_nolock+0x64/0x64
> > [ÂÂ317.337760]ÂÂ? __dump_page+0x1d7/0x430
> > [ÂÂ317.337760]ÂÂ__dump_page.cold.2+0x73/0x210
> > [ÂÂ317.337761]ÂÂdump_page+0x12/0x50
> > [ÂÂ317.337762]ÂÂhas_unmovable_pages+0x3e9/0x4b0
> > [ÂÂ317.337763]ÂÂstart_isolate_page_range+0x3b4/0x570
> > [ÂÂ317.337763]ÂÂ? unset_migratetype_isolate+0x280/0x280
> > [ÂÂ317.337764]ÂÂ? rcu_read_lock_bh_held+0xc0/0xc0
> > [ÂÂ317.337765]ÂÂ__offline_pages+0x1ad/0xa10
> > [ÂÂ317.337765]ÂÂ? lock_acquire+0x126/0x280
> > [ÂÂ317.337766]ÂÂ? __add_memory+0xc0/0xc0
> > [ÂÂ317.337767]ÂÂ? __kasan_check_write+0x14/0x20
> > [ÂÂ317.337767]ÂÂ? __mutex_lock+0x344/0xcd0
> > [ÂÂ317.337768]ÂÂ? _raw_spin_unlock_irqrestore+0x49/0x50
> > [ÂÂ317.337769]ÂÂ? device_offline+0x70/0x110
> > [ÂÂ317.337770]ÂÂ? klist_next+0x1c1/0x1e0
> > [ÂÂ317.337770]ÂÂ? __mutex_add_waiter+0xc0/0xc0
> > [ÂÂ317.337771]ÂÂ? klist_next+0x10b/0x1e0
> > [ÂÂ317.337772]ÂÂ? klist_iter_exit+0x16/0x40
> > [ÂÂ317.337772]ÂÂ? device_for_each_child+0xd0/0x110
> > [ÂÂ317.337773]ÂÂoffline_pages+0x11/0x20
> > [ÂÂ317.337774]ÂÂmemory_subsys_offline+0x7e/0xc0
> > [ÂÂ317.337774]ÂÂdevice_offline+0xd5/0x110
> > [ÂÂ317.337775]ÂÂ? auto_online_blocks_show+0x70/0x70
> > [ÂÂ317.337776]ÂÂstate_store+0xc6/0xe0
> > [ÂÂ317.337776]ÂÂdev_attr_store+0x3f/0x60
> > [ÂÂ317.337777]ÂÂ? device_match_name+0x40/0x40
> > [ÂÂ317.337778]ÂÂsysfs_kf_write+0x89/0xb0
> > [ÂÂ317.337778]ÂÂ? sysfs_file_ops+0xa0/0xa0
> > [ÂÂ317.337779]ÂÂkernfs_fop_write+0x188/0x240
> > [ÂÂ317.337780]ÂÂ__vfs_write+0x50/0xa0
> > [ÂÂ317.337780]ÂÂvfs_write+0x105/0x290
> > [ÂÂ317.337781]ÂÂksys_write+0xc6/0x160
> > [ÂÂ317.337782]ÂÂ? __x64_sys_read+0x50/0x50
> > [ÂÂ317.337782]ÂÂ? do_syscall_64+0x79/0x76c
> > [ÂÂ317.337783]ÂÂ? do_syscall_64+0x79/0x76c
> > [ÂÂ317.337784]ÂÂ__x64_sys_write+0x43/0x50
> > [ÂÂ317.337784]ÂÂdo_syscall_64+0xcc/0x76c
> > [ÂÂ317.337785]ÂÂ? trace_hardirqs_on_thunk+0x1a/0x20
> > [ÂÂ317.337786]ÂÂ? syscall_return_slowpath+0x210/0x210
> > [ÂÂ317.337787]ÂÂ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
> > [ÂÂ317.337787]ÂÂ? trace_hardirqs_off_caller+0x3a/0x150
> > [ÂÂ317.337788]ÂÂ? trace_hardirqs_off_thunk+0x1a/0x20
> > [ÂÂ317.337789]ÂÂentry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> Lovely.
>
> -ss