Re: [RFC] fuse: do not generate interrupt requests for fatal signals
From: Miklos Szeredi
Date: Thu Jun 13 2024 - 03:56:51 EST
On Thu, 13 Jun 2024 at 06:02, Haifeng Xu <haifeng.xu@xxxxxxxxxx> wrote:
>
> When the child reaper of a pid namespace exits, it invokes
> zap_pid_ns_processes() to send SIGKILL to all processes in the
> namespace and wait for them to exit. But one of the child processes
> gets stuck, and its call trace looks like this:
>
> [<0>] request_wait_answer+0x132/0x210 [fuse]
> [<0>] fuse_simple_request+0x1a8/0x2e0 [fuse]
> [<0>] fuse_flush+0x193/0x1d0 [fuse]
> [<0>] filp_close+0x34/0x70
> [<0>] close_fd+0x38/0x50
> [<0>] __x64_sys_close+0x12/0x40
> [<0>] do_syscall_64+0x59/0xc0
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae

Which process is this?

In my experience such lockups are caused by badly written fuse servers.

> The flags of the fuse request are (FR_ISREPLY | FR_FORCE | FR_WAITING
> | FR_INTERRUPTED | FR_SENT). For interrupt requests, fuse_dev_do_write()
> doesn't invoke fuse_request_end() to wake the client thread, so it will
> get stuck forever and the child reaper can't exit.
>
> To let the reply be written to the client thread so that it can exit
> the namespace, do not generate interrupt requests for fatal signals.

Interrupt requests must be generated for all signals. Not generating
them for SIGKILL would break existing filesystems.

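To illustrate what a well-behaved server is expected to do, here is a
minimal sketch in plain C. It is a toy model only: toy_server,
toy_submit() and toy_interrupt() are hypothetical names, not libfuse or
kernel code, and it ignores the case where the interrupt arrives before
the request it targets. The point is that the interrupt itself is not
what wakes the client; the server has to answer the original request,
for example with -EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of a server's pending-request table. */
#define MAX_PENDING 8

struct pending_req {
	uint64_t unique;	/* request ID the interrupt refers to */
	bool in_use;
	bool replied;		/* has the original request been answered? */
	int error;		/* reply status: 0 or a negative errno */
};

struct toy_server {
	struct pending_req reqs[MAX_PENDING];
};

/* Look up a pending request by its ID; NULL if unknown. */
static struct pending_req *find_req(struct toy_server *s, uint64_t unique)
{
	for (size_t i = 0; i < MAX_PENDING; i++)
		if (s->reqs[i].in_use && s->reqs[i].unique == unique)
			return &s->reqs[i];
	return NULL;
}

/* Record a newly received request; false if the table is full. */
static bool toy_submit(struct toy_server *s, uint64_t unique)
{
	assert(find_req(s, unique) == NULL);	/* IDs must not repeat */
	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (!s->reqs[i].in_use) {
			s->reqs[i] = (struct pending_req){
				.unique = unique, .in_use = true,
			};
			return true;
		}
	}
	return false;
}

/*
 * Handle an interrupt aimed at request `unique`: complete the ORIGINAL
 * request with -EINTR. Returns true if a pending request was completed,
 * false if it is unknown or was already answered.
 */
static bool toy_interrupt(struct toy_server *s, uint64_t unique)
{
	struct pending_req *r = find_req(s, unique);

	if (r == NULL || r->replied)
		return false;
	r->replied = true;
	r->error = -EINTR;
	return true;
}
```

A server that receives the interrupt but never replies to the original
request leaves the caller waiting, which matches the trace above.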
Thanks,
Miklos