Re: [syzbot] INFO: task can't die in blkdev_common_ioctl

From: Tetsuo Handa
Date: Mon Apr 04 2022 - 01:12:48 EST


On 2022/04/04 13:58, Christoph Hellwig wrote:
> On Sat, Apr 02, 2022 at 08:26:23PM +0900, Tetsuo Handa wrote:
>>> git tree: linux-next
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=168d578db00000
>>> kernel config: https://syzkaller.appspot.com/x/.config?x=be9183de0824e4d7
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=4f1a237abaf14719db49
>>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>>
>>> Unfortunately, I don't have any reproducer for this issue yet.
>>
>> This is a known problem, and Christoph does not like a mitigation patch.
>> https://lkml.kernel.org/r/YgoQBTzb3b0l1SzP@xxxxxxxxxxxxx
>
> And this report shows why: your patch makes the lock acquisition in
> blkdev_fallocate killable. Which would not help with this problem at
> all, as it does not come through blkdev_fallocate.

My patch proposes filemap_invalidate_lock_killable() and converts only
blkdev_fallocate() case as a starting point. Nothing prevents us from
converting e.g. blk_ioctl_zeroout() case as well. The "not come through
blkdev_fallocate" is bogus.

> The problem is that
> the loop writing actual zeroes to the disk can take forever, and we
> hold the invalidate_lock over that.

There can be several locations which cannot be converted to
filemap_invalidate_lock_killable() (like we had to use mutex_lock(&lo->lo_mutex)
for lo_release() case). But many locations can be converted to
filemap_invalidate_lock_killable() (like we use mutex_lock_killable(&lo->lo_mutex)
for almost all locations).

Being responsive to SIGKILL is good (even if we cannot afford making all locks
killable). I made many locations killable. Therefore, I wonder why not?