Re: [BUG] Infinite loop in cleanup_mnt() task_work on 6.3-rc3

From: Al Viro
Date: Tue Feb 06 2024 - 15:50:27 EST


On Tue, Feb 06, 2024 at 11:52:58AM -0800, Calvin Owens wrote:
> Hello all,
>
> A couple times in the past week, my laptop has been wedged by a spinning
> cleanup_mnt() task_work from an exiting container runtime (bwrap).
>
> The first time it reproduced was while writing to dm-crypt on nvme, so I
> blew it off as a manifestation of the tasklet corruption. But I hit it
> again last night on rc3, which contains commit 0a9bab391e33, so that's
> not it.
>
> I'm sorry to say I have very little to go on. Both times it happened, I
> was using Nautilus to browse around in some directories, but I've tried
> monkeying around with that and had no luck reproducing it. The spinning
> happens late enough in the exit path that /proc/self/ is gutted, so I
> don't know what the bwrap container was actually doing.
>
> The NMI stacktrace and the kconfig I'm running are below. The spinning
> task still moves between CPUs. No hung task notifications appear except
> for random sync() calls happening afterwards from userspace, which all
> block on super_lock() in iterate_supers(). Trying to ptrace the stuck
> process also hangs the tracing process forever.
>
> I rebuilt with lockdep this morning, but haven't seen any splats, and
> haven't hit the bug again.
>
> Please let me know if you see anything specific I can test or try that
> might help narrow the problem down. Otherwise, I'll keep working on
> finding a reliable reproducer.

Check if git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #fixes
helps.