Re: "Fix data race in tty_insert_flip_string_fixed_flag" triggers a lockdep warning

From: Qian Cai
Date: Thu Oct 31 2019 - 11:22:18 EST


On Wed, 2019-10-30 at 17:40 -0400, Qian Cai wrote:
> The commit b6da31b2c07c ("tty: Fix data race in
> tty_insert_flip_string_fixed_flag") creates a locking order,
>
> &(&port->lock)->rlock --> &(&zone->lock)->rlock
>
> which will trigger a lockdep warning below.
>
> [ÂÂ489.312784] WARNING: possible circular locking dependency detected
> [ÂÂ489.319655] 5.4.0-rc5-next-20191030+ #2 Tainted: GÂÂÂÂÂÂÂÂWÂÂÂÂLÂÂÂ
> [ÂÂ489.326610] ------------------------------------------------------
> [ÂÂ489.333479] sshd/5541 is trying to acquire lock:
> [ÂÂ489.338785] ffff008b7cff9290 (&(&zone->lock)->rlock){..-.}, at:
> rmqueue+0xfe8/0x2274
> [ÂÂ489.347230]Â
> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂbut task is already holding lock:
> [ÂÂ489.354444] 1cff0009e39c0540 (&(&port->lock)->rlock){-.-.}, at:
> pty_write+0x60/0xd0
> [ÂÂ489.362799]Â
> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂwhich lock already depends on the new lock.
>
> [ÂÂ489.373052]Â
> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂthe existing dependency chain (in reverse order) is:
> [ÂÂ489.381916]Â
> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ-> #3 (&(&port->lock)->rlock){-.-.}:
> [ÂÂ489.389403]ÂÂÂÂÂÂÂÂlock_acquire+0x320/0x360
> [ÂÂ489.394279]ÂÂÂÂÂÂÂÂ_raw_spin_lock_irqsave+0x7c/0x9c
> [ÂÂ489.399849]ÂÂÂÂÂÂÂÂtty_port_tty_get+0x24/0x60
> [ÂÂ489.404896]ÂÂÂÂÂÂÂÂtty_port_default_wakeup+0x1c/0x3c
> [ÂÂ489.410551]ÂÂÂÂÂÂÂÂtty_port_tty_wakeup+0x34/0x40
> [ÂÂ489.415859]ÂÂÂÂÂÂÂÂuart_write_wakeup+0x28/0x44
> [ÂÂ489.420993]ÂÂÂÂÂÂÂÂpl011_tx_chars+0x1b8/0x270
> [ÂÂ489.426040]ÂÂÂÂÂÂÂÂpl011_start_tx+0x24/0x70
> [ÂÂ489.430913]ÂÂÂÂÂÂÂÂ__uart_start+0x5c/0x68
> [ÂÂ489.435612]ÂÂÂÂÂÂÂÂuart_write+0x164/0x1c8
> [ÂÂ489.440312]ÂÂÂÂÂÂÂÂdo_output_char+0x33c/0x348
> [ÂÂ489.445358]ÂÂÂÂÂÂÂÂn_tty_write+0x4bc/0x60c
> [ÂÂ489.450143]ÂÂÂÂÂÂÂÂtty_write+0x338/0x474
> [ÂÂ489.454755]ÂÂÂÂÂÂÂÂredirected_tty_write+0xc0/0xdc
> [ÂÂ489.460150]ÂÂÂÂÂÂÂÂdo_loop_readv_writev+0x140/0x180
> [ÂÂ489.465718]ÂÂÂÂÂÂÂÂdo_iter_write+0xe0/0x10c
> [ÂÂ489.470590]ÂÂÂÂÂÂÂÂvfs_writev+0x134/0x1cc
> [ÂÂ489.475289]ÂÂÂÂÂÂÂÂdo_writev+0xbc/0x130
> [ÂÂ489.479815]ÂÂÂÂÂÂÂÂ__arm64_sys_writev+0x58/0x8c
> [ÂÂ489.485038]ÂÂÂÂÂÂÂÂel0_svc_handler+0x170/0x240
> [ÂÂ489.490173]ÂÂÂÂÂÂÂÂel0_sync_handler+0x150/0x250
> [ÂÂ489.495393]ÂÂÂÂÂÂÂÂel0_sync+0x164/0x180
> [ÂÂ489.499915]Â
> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ-> #2 (&port_lock_key){-.-.}:
> [ÂÂ489.506793]ÂÂÂÂÂÂÂÂlock_acquire+0x320/0x360
> [ÂÂ489.511666]ÂÂÂÂÂÂÂÂ_raw_spin_lock+0x64/0x80
> [ÂÂ489.516540]ÂÂÂÂÂÂÂÂpl011_console_write+0xec/0x2cc
> [ÂÂ489.521934]ÂÂÂÂÂÂÂÂconsole_unlock+0x794/0x96c
> [ÂÂ489.526981]ÂÂÂÂÂÂÂÂvprintk_emit+0x260/0x31c
> [ÂÂ489.531854]ÂÂÂÂÂÂÂÂvprintk_default+0x54/0x7c
> [ÂÂ489.536813]ÂÂÂÂÂÂÂÂvprintk_func+0x218/0x254
> [ÂÂ489.541687]ÂÂÂÂÂÂÂÂprintk+0x7c/0xa4
>
> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ-> #1 (console_owner){-...}:
> [ÂÂ489.703613]ÂÂÂÂÂÂÂÂlock_acquire+0x320/0x360
> [ÂÂ489.708486]ÂÂÂÂÂÂÂÂconsole_lock_spinning_enable+0x6c/0x7c
> [ÂÂ489.714575]ÂÂÂÂÂÂÂÂconsole_unlock+0x4f8/0x96c
> [ÂÂ489.719622]ÂÂÂÂÂÂÂÂvprintk_emit+0x260/0x31c
> [ÂÂ489.724495]ÂÂÂÂÂÂÂÂvprintk_default+0x54/0x7c
> [ÂÂ489.729454]ÂÂÂÂÂÂÂÂvprintk_func+0x218/0x254
> [ÂÂ489.734327]ÂÂÂÂÂÂÂÂprintk+0x7c/0xa4
> [ÂÂ489.810371]Â

Correction -- the call traces of the locking order from,

&(&zone->lock)->rlock --> console_owner:

ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ-> #1 (console_owner){-...}:
[ÂÂ489.703613]ÂÂÂÂÂÂÂÂlock_acquire+0x320/0x360
[ÂÂ489.708486]ÂÂÂÂÂÂÂÂconsole_lock_spinning_enable+0x6c/0x7c
[ÂÂ489.714575]ÂÂÂÂÂÂÂÂconsole_unlock+0x4f8/0x96c
[ÂÂ489.719622]ÂÂÂÂÂÂÂÂvprintk_emit+0x260/0x31c
[ÂÂ489.724495]ÂÂÂÂÂÂÂÂvprintk_default+0x54/0x7c
[ÂÂ489.729454]ÂÂÂÂÂÂÂÂvprintk_func+0x218/0x254
[ÂÂ489.734327]ÂÂÂÂÂÂÂÂprintk+0x7c/0xa4
[ÂÂ489.738506]ÂÂÂÂÂÂÂÂget_random_u64+0x1c4/0x1dc
[ÂÂ489.743552]ÂÂÂÂÂÂÂÂadd_to_free_area_random+0x50/0x16c
[ÂÂ489.749292]ÂÂÂÂÂÂÂÂ__free_one_page+0x374/0x6a0
[ÂÂ489.754425]ÂÂÂÂÂÂÂÂfree_one_page+0x6c/0x11c
[ÂÂ489.810371]Â

Should b6da31b2c07c ("tty: Fix data race in tty_insert_flip_string_fixed_flag")
be reverted first while taking time to rework?

> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ-> #0 (&(&zone->lock)->rlock){..-.}:
> [ÂÂ489.817854]ÂÂÂÂÂÂÂÂvalidate_chain+0xf6c/0x2e2c
> [ÂÂ489.822988]ÂÂÂÂÂÂÂÂ__lock_acquire+0x868/0xc2c
> [ÂÂ489.828035]ÂÂÂÂÂÂÂÂlock_acquire+0x320/0x360
> [ÂÂ489.832909]ÂÂÂÂÂÂÂÂ_raw_spin_lock_irqsave+0x7c/0x9c
> [ÂÂ489.838475]ÂÂÂÂÂÂÂÂrmqueue+0xfe8/0x2274
> [ÂÂ489.843002]ÂÂÂÂÂÂÂÂget_page_from_freelist+0x474/0x688
> [ÂÂ489.848743]ÂÂÂÂÂÂÂÂ__alloc_pages_nodemask+0x3b8/0x1928
> [ÂÂ489.854573]ÂÂÂÂÂÂÂÂalloc_pages_current+0xd0/0xe0
> [ÂÂ489.859882]ÂÂÂÂÂÂÂÂstack_depot_save+0x300/0x440
> [ÂÂ489.865103]ÂÂÂÂÂÂÂÂ__kasan_kmalloc+0x198/0x1e0
> [ÂÂ489.870236]ÂÂÂÂÂÂÂÂkasan_slab_alloc+0x18/0x20
> [ÂÂ489.875284]ÂÂÂÂÂÂÂÂslab_post_alloc_hook+0x48/0x9c
> [ÂÂ489.880677]ÂÂÂÂÂÂÂÂkmem_cache_alloc+0x2e8/0x588
> [ÂÂ489.885898]ÂÂÂÂÂÂÂÂcreate_object+0x60/0x698
> [ÂÂ489.890771]ÂÂÂÂÂÂÂÂkmemleak_alloc+0x78/0xb8
> [ÂÂ489.895644]ÂÂÂÂÂÂÂÂslab_post_alloc_hook+0x64/0x9c
> [ÂÂ489.901038]ÂÂÂÂÂÂÂÂ__kmalloc+0x328/0x490
> [ÂÂ489.905652]ÂÂÂÂÂÂÂÂ__tty_buffer_request_room+0x118/0x1f8
> [ÂÂ489.911654]ÂÂÂÂÂÂÂÂtty_insert_flip_string_fixed_flag+0x6c/0x144
> [ÂÂ489.918263]ÂÂÂÂÂÂÂÂpty_write+0x80/0xd0
> [ÂÂ489.922701]ÂÂÂÂÂÂÂÂn_tty_write+0x450/0x60c
> [ÂÂ489.927486]ÂÂÂÂÂÂÂÂtty_write+0x338/0x474
> [ÂÂ489.932098]ÂÂÂÂÂÂÂÂ__vfs_write+0x88/0x214
> [ÂÂ489.936796]ÂÂÂÂÂÂÂÂvfs_write+0x12c/0x1a4
> [ÂÂ489.941411]ÂÂÂÂÂÂÂÂksys_write+0xb0/0x120
> [ÂÂ489.946027]ÂÂÂÂÂÂÂÂ__arm64_sys_write+0x54/0x88
> [ÂÂ489.951161]ÂÂÂÂÂÂÂÂel0_svc_handler+0x170/0x240
> [ÂÂ489.956294]ÂÂÂÂÂÂÂÂel0_sync_handler+0x150/0x250
> [ÂÂ489.961514]ÂÂÂÂÂÂÂÂel0_sync+0x164/0x180
> [ÂÂ489.966038]Â
> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂother info that might help us debug this:
>
> [ÂÂ489.976117] Chain exists of:
> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ&(&zone->lock)->rlock --> &port_lock_key --> &(&port->lock)-
> > rlock
>
> [ÂÂ489.989937]ÂÂPossible unsafe locking scenario:
>
> [ÂÂ489.997239]ÂÂÂÂÂÂÂÂCPU0ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂCPU1
> [ÂÂ490.002456]ÂÂÂÂÂÂÂÂ----ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ----
> [ÂÂ490.007673]ÂÂÂlock(&(&port->lock)->rlock);
> [ÂÂ490.012459]ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂlock(&port_lock_key);
> [ÂÂ490.019155]ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂlock(&(&port->lock)->rlock);
> [ÂÂ490.026458]ÂÂÂlock(&(&zone->lock)->rlock);
> [ÂÂ490.031244]Â
> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ*** DEADLOCK ***
>
> [ÂÂ490.039241] 5 locks held by sshd/5541:
> [ÂÂ490.043677]ÂÂ#0: 12ff0009eb0c6890 (&tty->ldisc_sem){++++}, at:
> ldsem_down_read+0x3c/0x48
> [ÂÂ490.052465]ÂÂ#1: 12ff0009eb0c6918 (&tty->atomic_write_lock){+.+.}, at:
> tty_write+0x12c/0x474
> [ÂÂ490.061599]ÂÂ#2: 12ff0009eb0c6aa0 (&o_tty->termios_rwsem/1){++++}, at:
> n_tty_write+0x10c/0x60c
> [ÂÂ490.070908]ÂÂ#3: ffff90001f852360 (&ldata->output_lock){+.+.}, at:
> n_tty_write+0x278/0x60c
> [ÂÂ490.079868]ÂÂ#4: 1cff0009e39c0540 (&(&port->lock)->rlock){-.-.}, at:
> pty_write+0x60/0xd0
> [ÂÂ490.088654]Â
> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂstack backtrace:
> [ÂÂ490.094398] CPU: 31 PID: 5541 Comm: sshd Tainted: GÂÂÂÂÂÂÂÂWÂÂÂÂLÂÂÂÂ5.4.0-
> rc5-next-20191030+ #2
> [ÂÂ490.103871] Hardware name: HPE Apollo 70ÂÂÂÂÂÂÂÂÂÂÂÂÂ/C01_APACHE_MBÂÂÂÂÂÂÂÂÂ,
> BIOS L50_5.13_1.11 06/18/2019
> [ÂÂ490.114300] Call trace:
> [ÂÂ490.117438]ÂÂdump_backtrace+0x0/0x248
> [ÂÂ490.121791]ÂÂshow_stack+0x20/0x2c
> [ÂÂ490.125798]ÂÂdump_stack+0xc8/0x130
> [ÂÂ490.129890]ÂÂprint_circular_bug+0x368/0x380
> [ÂÂ490.134764]ÂÂcheck_noncircular+0x28c/0x294
> [ÂÂ490.139550]ÂÂvalidate_chain+0xf6c/0x2e2c
> [ÂÂ490.144164]ÂÂ__lock_acquire+0x868/0xc2c
> [ÂÂ490.148691]ÂÂlock_acquire+0x320/0x360
> [ÂÂ490.153044]ÂÂ_raw_spin_lock_irqsave+0x7c/0x9c
> [ÂÂ490.158090]ÂÂrmqueue+0xfe8/0x2274
> [ÂÂ490.162097]ÂÂget_page_from_freelist+0x474/0x688
> [ÂÂ490.167317]ÂÂ__alloc_pages_nodemask+0x3b8/0x1928
> [ÂÂ490.172625]ÂÂalloc_pages_current+0xd0/0xe0
> [ÂÂ490.177412]ÂÂstack_depot_save+0x300/0x440
> [ÂÂ490.182112]ÂÂ__kasan_kmalloc+0x198/0x1e0
> [ÂÂ490.186723]ÂÂkasan_slab_alloc+0x18/0x20
> [ÂÂ490.191249]ÂÂslab_post_alloc_hook+0x48/0x9c
> [ÂÂ490.196121]ÂÂkmem_cache_alloc+0x2e8/0x588
> [ÂÂ490.200820]ÂÂcreate_object+0x60/0x698
> [ÂÂ490.205173]ÂÂkmemleak_alloc+0x78/0xb8
> [ÂÂ490.209525]ÂÂslab_post_alloc_hook+0x64/0x9c
> [ÂÂ490.214398]ÂÂ__kmalloc+0x328/0x490
> [ÂÂ490.218491]ÂÂ__tty_buffer_request_room+0x118/0x1f8
> [ÂÂ490.223972]ÂÂtty_insert_flip_string_fixed_flag+0x6c/0x144
> [ÂÂ490.230059]ÂÂpty_write+0x80/0xd0
> [ÂÂ490.233977]ÂÂn_tty_write+0x450/0x60c
> [ÂÂ490.238241]ÂÂtty_write+0x338/0x474
> [ÂÂ490.242333]ÂÂ__vfs_write+0x88/0x214
> [ÂÂ490.246511]ÂÂvfs_write+0x12c/0x1a4
> [ÂÂ490.250601]ÂÂksys_write+0xb0/0x120
> [ÂÂ490.254693]ÂÂ__arm64_sys_write+0x54/0x88
> [ÂÂ490.259305]ÂÂel0_svc_handler+0x170/0x240
> [ÂÂ490.263918]ÂÂel0_sync_handler+0x150/0x250
> [ÂÂ490.268617]ÂÂel0_sync+0x164/0x180