Re: mm lock issue while booting Linux on 5.8-rc1 for RISC-V

From: Stafford Horne
Date: Tue Jun 16 2020 - 00:51:15 EST


On Tue, Jun 16, 2020 at 06:57:47AM +0900, Stafford Horne wrote:
> On Mon, Jun 15, 2020 at 12:28:11AM -0700, Atish Patra wrote:
> > Hi,
> > I encountered the following issue while booting 5.8-rc1 on Qemu for RV64.
> > I added additional dump_stack and observed that it's happening in bpf free path.
> > It happens always if CONFIG_DEBUG_VM is enabled. VM_BUG_ON_MM is
> > compiled away without that.
> > ------------------------------------------------------------------------
> > forked to background, child pid 113
> > [ 10.328850] CPU: 3 PID: 51 Comm: kworker/3:1 Not tainted
> > 5.8.0-rc1-dirty #732
> > [ 10.331739] Workqueue: events bpf_prog_free_deferred
> > [ 10.334133] Call Trace:
> > [ 10.338039] [<ffffffe000202698>] walk_stackframe+0x0/0xa4
> > [ 10.339988] [<ffffffe000202880>] show_stack+0x2e/0x3a
> > [ 10.340902] [<ffffffe00047074c>] dump_stack+0x72/0x8c
> > [ 10.341451] [<ffffffe0002db4ce>] mmap_assert_locked.part.13+0x14/0x1c
> > [ 10.342131] [<ffffffe0002db330>] walk_page_range_novma+0x0/0x4e
> > [ 10.342973] [<ffffffe000204f94>] set_direct_map_invalid_noflush+0x66/0x6e
> > [ 10.343917] [<ffffffe0002e0706>] __vunmap+0xe8/0x212
> > [ 10.344680] [<ffffffe0002e0882>] __vfree+0x22/0x6e
> > [ 10.345270] [<ffffffe0002e0902>] vfree+0x34/0x56
> > [ 10.345834] [<ffffffe00027d752>] __bpf_prog_free+0x2c/0x36
> > [ 10.346529] [<ffffffe0002801a2>] bpf_prog_free_deferred+0x74/0x8a
> > [ 10.347394] [<ffffffe000219c70>] process_one_work+0x13a/0x272
> > [ 10.348239] [<ffffffe00021a4b4>] worker_thread+0x50/0x2e4
> > [ 10.348900] [<ffffffe00021ed98>] kthread+0xfc/0x10a
> > [ 10.349470] [<ffffffe0002013da>] ret_from_exception+0x0/0xc
> > [ 10.354405] mm ffffffe001018600 mmap 0000000000000000 seqnum 0 task_size 0
> > [ 10.354405] get_unmapped_area 0000000000000000
> > [ 10.354405] mmap_base 0 mmap_legacy_base 0 highest_vm_end 0
> > [ 10.354405] pgd ffffffe001074000 mm_users 2 mm_count 1
> > pgtables_bytes 8192 map_count 0
> > [ 10.354405] hiwater_rss 0 hiwater_vm 0 total_vm 0 locked_vm 0
> > [ 10.354405] pinned_vm 0 data_vm 0 exec_vm 0 stack_vm 0
> > [ 10.354405] start_code ffffffe000200000 end_code ffffffe00084acc2
> > start_data 0 end_data ffffffe00106dfe4
> > [ 10.354405] start_brk 0 brk ffffffe0010bd6d0 start_stack 0
> > [ 10.354405] arg_start 0 arg_end 0 env_start 0 env_end 0
> > [ 10.354405] binfmt 0000000000000000 flags 0 core_state 0000000000000000
> > [ 10.354405] ioctx_table 0000000000000000
> > [ 10.354405] exe_file 0000000000000000
> > [ 10.354405] tlb_flush_pending 0
> > [ 10.354405] def_flags: 0x0()
> > [ 10.369325] ------------[ cut here ]------------
> > [ 10.370763] kernel BUG at include/linux/mmap_lock.h:81!
> > [ 10.375235] Kernel BUG [#1]
> > [ 10.377198] Modules linked in:
> > [ 10.378931] CPU: 3 PID: 51 Comm: kworker/3:1 Not tainted 5.8.0-rc1-dirty #732
> > [ 10.380179] Workqueue: events bpf_prog_free_deferred
> > [ 10.381270] epc: ffffffe0002db4d4 ra : ffffffe0002db4d4 sp : ffffffe3eaea7c70
> > [ 10.382561] gp : ffffffe00106d950 tp : ffffffe3ef752f80 t0 :
> > ffffffe0010836e8
> > [ 10.383996] t1 : 0000000000000064 t2 : 0000000000000000 s0 :
> > ffffffe3eaea7c90
> > [ 10.385119] s1 : ffffffe001018600 a0 : 0000000000000289 a1 :
> > 0000000000000020
> > [ 10.386099] a2 : 0000000000000005 a3 : 0000000000000000 a4 :
> > ffffffe001012758
> > [ 10.387294] a5 : 0000000000000000 a6 : 0000000000000102 a7 :
> > 0000000000000006
> > [ 10.388265] s2 : ffffffe3f00674c0 s3 : ffffffe00106e108 s4 :
> > ffffffe00106e100
> > [ 10.389250] s5 : ffffffe00106e908 s6 : 0000000000000000 s7 :
> > 6db6db6db6db6db7
> > [ 10.390272] s8 : 0000000000000001 s9 : ffffffe00021a4f8 s10:
> > ffffffffffffffff
> > [ 10.391293] s11: ffffffe3f0066600 t3 : 000000000001a7a8 t4 :
> > 000000000001a7a8
> > [ 10.392314] t5 : 0000000000000000 t6 : ffffffe00107b76b
> > [ 10.393096] status: 0000000000000120 badaddr: 0000000000000000
> > cause: 0000000000000003
> > [ 10.397755] ---[ end trace 861659596ac28841 ]---
> > ---------------------------------------------------------------------------------------------------
> >
> > I haven't had the chance to bisect to figure out which commit caused
> > the issue. Just wanted
> > to check if it is a known issue already.
>
> Hi Atish,
>
> Note, I am getting the same (just now) when booting v5.8-rc1 on OpenRISC. If
> you have any updates please post back. I will try to look into this today or
> tomorrow.

I have bisected this to, 42fc541404f249778e752ab39c8bc25fcb2dbe1e:

mmap locking API: add mmap_assert_locked() and mmap_assert_write_locked()

This should have just changed the existing lockdep api's but something has
changed. I haven't had time to look at it yet.

Ccing: Michel Lespinasse <walken@xxxxxxxxxx>

-Stafford