Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter

From: Mateusz Guzik
Date: Mon Mar 24 2025 - 12:03:30 EST


On Mon, Mar 24, 2025 at 3:52 PM K Prateek Nayak <kprateek.nayak@xxxxxxx> wrote:
> So far, with tracing, this is where I am:
>
> o Mainline + Oleg's optimization reverted:
>
> ...
> kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55
> kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55
> kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7
> kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7
> repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector
> repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work
> repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write
> repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write
> repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0
> repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS!
> kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector
> kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work
> kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up!
> ... (syzbot reproducer continues)
>
> o Mainline:
>
> kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7
> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7
> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55
> kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55
> repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector
> repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work
> repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write
> repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write
> repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0
> repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS!
> ... (syzbot reproducer hangs)
>
> There is a third "kworker/u1030" component that never gets woken up for
> reasons currently unknown to me with Oleg's optimization. I'll keep
> digging.
>

Thanks for the update.

It is unclear to me if you checked, so I'm going to have to ask just
in case: when there is a hang, is there *anyone* stuck in pipe code
(and if so, where)?

You can get the kernel to dump the stacks of all threads to the kernel
log with sysrq:

    echo t > /proc/sysrq-trigger
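
If the dump is too noisy to read by hand, something along these lines
could narrow it down to tasks with pipe code on the stack. This is only a
rough sketch: pipe_stacks is a made-up helper name, it assumes each task
block in the dump starts with a line containing "task:" (as recent kernels
print it), and it assumes the pipe functions of interest have "pipe_" in
their symbol names.

```shell
#!/bin/sh
# Hypothetical helper: read a sysrq-t task dump on stdin and print only the
# task blocks whose stack mentions a symbol containing "pipe_".
# Assumption: every task block begins with a line containing "task:".
pipe_stacks() {
    awk '
        # A new task header: flush the previous block if it matched.
        /task:/ { if (hit) printf "%s", block; block = ""; hit = 0 }
        # Accumulate every line into the current block.
        { block = block $0 "\n" }
        # Remember that this block touches pipe code.
        /pipe_/ { hit = 1 }
        # Flush the last block at end of input.
        END { if (hit) printf "%s", block }
    '
}

# Usage (as root, while the reproducer is hung):
#   echo t > /proc/sysrq-trigger
#   dmesg | pipe_stacks
```

With a lot of threads the kernel log buffer may wrap before the dump is
complete, so reading it from a serial console or a larger log buffer may
be needed.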

--
Mateusz Guzik <mjguzik gmail.com>