Re: splice: infinite busy loop lockup bug

From: Qian Cai
Date: Mon Aug 31 2020 - 21:48:08 EST


On Tue, Sep 01, 2020 at 02:09:28AM +0100, Al Viro wrote:
> On Mon, Aug 31, 2020 at 08:51:32PM -0400, Qian Cai wrote:
> > On Fri, Aug 07, 2020 at 09:34:08PM +0900, Tetsuo Handa wrote:
> > > On 2020/08/07 21:27, Al Viro wrote:
> > > > On Fri, Aug 07, 2020 at 07:35:08PM +0900, Tetsuo Handa wrote:
> > > >> syzbot is reporting hung task at pipe_release() [1], for for_each_bvec() from
> > > >> iterate_bvec() from iterate_all_kinds() from iov_iter_alignment() from
> > > >> ext4_unaligned_io() from ext4_dio_write_iter() from ext4_file_write_iter() from
> > > >> call_write_iter() from do_iter_readv_writev() from do_iter_write() from
> > > >> vfs_iter_write() from iter_file_splice_write() falls into infinite busy loop
> > > >> with pipe->mutex held.
> > > >>
> > > >> The reason of falling into infinite busy loop is that iter_file_splice_write()
> > > >> for some reason generates "struct bio_vec" entry with .bv_len=0 and .bv_offset=0
> > > >> while for_each_bvec() cannot handle .bv_len == 0.
> > > >
> > > > broken in 1bdc76aea115 "iov_iter: use bvec iterator to implement iterate_bvec()",
> > > > unless I'm misreading it...
> >
> > I have been chasing something similar for a while as in,
> >
> > https://lore.kernel.org/linux-fsdevel/89F418A9-EB20-48CB-9AE0-52C700E6BB74@xxxxxx/
> >
> > In my case, it seems the endless loop happens in iterate_iovec() instead where
> > I put a debug patch here,
> >
> > --- a/lib/iov_iter.c
> > +++ b/lib/iov_iter.c
> > @@ -33,6 +33,7 @@
> > if (unlikely(!__v.iov_len)) \
> > continue; \
> > __v.iov_base = __p->iov_base; \
> > + printk_ratelimited("ITER_IOVEC left = %zu, n = %zu\n", left, n); \
> > left = (STEP); \
> > __v.iov_len -= left; \
> > skip = __v.iov_len; \
> >
> > and end up seeing overflows ("n" supposes to be less than PAGE_SIZE) before the
> > soft-lockups and a dead system,
> >
> > [ 4300.249180][T470195] ITER_IOVEC left = 0, n = 48566423
> >
> > Thoughts?
>
> Er... Where does that size come from? If that's generic_perform_write(),
> I'd like to see pos, offset and bytes at the time of call... ->iov_offset would
> also be interesting to see (along with the entire iovec array, really).

Yes, generic_perform_write(). I'll see if I can capture more information.

[ 2867.463013][T217919] ITER_IOVEC left = 0, n = 2209
[ 2867.466154][T217971] ITER_IOVEC left = 0, n = 2093279
[ 2867.903689][T217971] ITER_IOVEC left = 0, n = 2093259
[ 2867.928514][T217971] ITER_IOVEC left = 0, n = 2093239
[ 2867.952450][T217971] ITER_IOVEC left = 0, n = 2090980
[ 2867.976585][T217971] ITER_IOVEC left = 0, n = 2090960
[ 2869.219459][T217774] futex_wake_op: trinity-c61 tries to shift op by -1; fix this program
[ 2870.005178][T218110] futex_wake_op: trinity-c9 tries to shift op by -1; fix this program
[ 2870.297607][T218213] futex_wake_op: trinity-c4 tries to shift op by -836; fix this program
[ 2870.338118][T218213] futex_wake_op: trinity-c4 tries to shift op by -836; fix this program
[-- MARK -- Mon Aug 31 18:50:00 2020]
[ 2893.870387][ C62] watchdog: BUG: soft lockup - CPU#62 stuck for 23s! [trinity-c5:218155]
[ 2893.912341][ C62] Modules linked in: nls_ascii nls_cp437 vfat fat kvm_intel kvm irqbypass efivars ip_tables x_tables sd_mod ahci bnx2x libahci mdio libata firmware_class dm_mirror dm_region_hash dm_log dm_mod efivarfs
[ 2894.003787][ C62] irq event stamp: 55378
[ 2894.022398][ C62] hardirqs last enabled at (55377): [<ffffffff95a00c42>] asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 2894.070770][ C62] hardirqs last disabled at (55378): [<ffffffff959b111d>] irqentry_enter+0x1d/0x50
[ 2894.112794][ C62] softirqs last enabled at (50602): [<ffffffff95c0070f>] __do_softirq+0x70f/0xa9f
[ 2894.154580][ C62] softirqs last disabled at (49393): [<ffffffff95a00ec2>] asm_call_on_stack+0x12/0x20
[ 2894.197654][ C62] CPU: 62 PID: 218155 Comm: trinity-c5 Not tainted 5.9.0-rc2-next-20200828+ #4
[ 2894.239807][ C62] Hardware name: HP Synergy 480 Gen9/Synergy 480 Gen9 Compute Module, BIOS I37 10/21/2019
[ 2894.284894][ C62] RIP: 0010:iov_iter_copy_from_user_atomic+0x598/0xab0
[ 2894.316037][ C62] Code: 42 0f b6 14 08 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 7c 03 00 00 48 8d 7e 08 8b 46 0c 48 89 fa 48 c1 ea 03 01 e8 <42> 0f b6 14 0a 84 d2 74 09 80 fa 03 0f 8e 0e 03 00 00 44 8b 6e 08
[ 2894.408763][ C62] RSP: 0018:ffffc9000fb7f848 EFLAGS: 00000246
[ 2894.440034][ C62] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff888fce295100
[ 2894.480312][ C62] RDX: 1ffff111f9c52a21 RSI: ffff888fce295100 RDI: ffff888fce295108
[ 2894.517551][ C62] RBP: 0000000000000000 R08: fffff52001f6ffa1 R09: dffffc0000000000
[ 2894.556460][ C62] R10: ffff889055c1f000 R11: 0000000000000400 R12: ffffc9000fb7fcf0
[ 2894.593928][ C62] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000400
[ 2894.630979][ C62] FS: 00007ff89140a740(0000) GS:ffff88905fd80000(0000) knlGS:0000000000000000
[ 2894.673104][ C62] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2894.702914][ C62] CR2: 00007ff8906242fc CR3: 000000101bb76004 CR4: 00000000003706e0
[ 2894.740628][ C62] DR0: 00007ff891250000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2894.778016][ C62] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
[ 2894.815393][ C62] Call Trace:
[ 2894.830531][ C62] ? shmem_write_begin+0xc4/0x1b0
[ 2894.853643][ C62] generic_perform_write+0x2ac/0x410
[ 2894.878266][ C62] ? filemap_check_errors+0xe0/0xe0
[ 2894.901557][ C62] ? file_update_time+0x215/0x380
[ 2894.925234][ C62] ? update_time+0xa0/0xa0
[ 2894.947683][ C62] ? down_write+0xdb/0x150
[ 2894.969843][ C62] __generic_file_write_iter+0x2fe/0x4f0
[ 2894.995527][ C62] ? rcu_read_unlock+0x50/0x50
[ 2895.017683][ C62] generic_file_write_iter+0x2ee/0x520
[ 2895.042981][ C62] ? __generic_file_write_iter+0x4f0/0x4f0
[ 2895.069612][ C62] ? __mutex_lock+0x4af/0x1390
[ 2895.092240][ C62] do_iter_readv_writev+0x388/0x6f0
[ 2895.115794][ C62] ? lockdep_hardirqs_on_prepare+0x33e/0x4e0
[ 2895.143303][ C62] ? default_llseek+0x240/0x240
[ 2895.165522][ C62] ? rcu_read_lock_bh_held+0xc0/0xc0
[ 2895.190122][ C62] do_iter_write+0x130/0x5f0
[ 2895.211226][ C62] iter_file_splice_write+0x54c/0xa40
[ 2895.235917][ C62] ? page_cache_pipe_buf_try_steal+0x1e0/0x1e0
[ 2895.264415][ C62] ? rcu_read_lock_any_held+0xdb/0x100
[ 2895.290249][ C62] do_splice+0x86c/0x1440
[ 2895.310155][ C62] ? syscall_enter_from_user_mode+0x1b/0x230
[ 2895.337531][ C62] ? direct_splice_actor+0x100/0x100
[ 2895.362348][ C62] __x64_sys_splice+0x151/0x200
[ 2895.384677][ C62] do_syscall_64+0x33/0x40
[ 2895.405223][ C62] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2895.433800][ C62] RIP: 0033:0x7ff890d246ed
[ 2895.456812][ C62] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6b 57 2c 00 f7 d8 64 89 01 48
[ 2895.553070][ C62] RSP: 002b:00007fff8a6acfb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000113
[ 2895.592859][ C62] RAX: ffffffffffffffda RBX: 0000000000000113 RCX: 00007ff890d246ed
[ 2895.630744][ C62] RDX: 000000000000015b RSI: 0000000000000000 RDI: 000000000000016b
[ 2895.668870][ C62] RBP: 0000000000000113 R08: 0000000000000400 R09: 0000000000000000
[ 2895.707052][ C62] R10: 00007ff88f30a000 R11: 0000000000000246 R12: 0000000000000002
[ 2895.744716][ C62] R13: 00007ff8913ee058 R14: 00007ff89140a6c0 R15: 00007ff8913ee000