Re: possible deadlock in aio_poll

From: Miklos Szeredi
Date: Mon Sep 10 2018 - 14:14:25 EST


On Mon, Sep 10, 2018 at 6:53 PM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> On Mon, Sep 10, 2018 at 12:41:05AM -0700, syzbot wrote:
>> =====================================================
>> WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
>> 4.19.0-rc2+ #229 Not tainted
>> -----------------------------------------------------
>> syz-executor2/9399 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
>> 00000000126506e0 (&ctx->fd_wqh){+.+.}, at: spin_lock include/linux/spinlock.h:329 [inline]
>> 00000000126506e0 (&ctx->fd_wqh){+.+.}, at: aio_poll+0x760/0x1420 fs/aio.c:1747
>>
>> and this task is already holding:
>> 000000002bed6bf6 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:354 [inline]
>> 000000002bed6bf6 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll+0x738/0x1420 fs/aio.c:1746
>> which would create a new lock dependency:
>> (&(&ctx->ctx_lock)->rlock){..-.} -> (&ctx->fd_wqh){+.+.}
>
> ctx->fd_wqh seems to exist only in userfaultfd, which indeed does
> strange open-coded waitqueue locking and apparently fails to disable
> irqs. Something like this should fix it:

Why do pollable waitqueues need to disable interrupts in general?

I don't see anything fundamental in the poll interface that would force
this requirement on its users.
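
For reference, here is how I read the annotations in the report. This is
only a toy userspace sketch, not kernel code: it assumes the usual lockdep
convention that the four characters in braces stand for hardirq,
hardirq-read, softirq and softirq-read usage, with '-' meaning "taken in
that irq context" and '+' meaning "taken with that irq type enabled". The
struct and helper names (lock_model, softirq_safe, softirq_unsafe,
bad_softirq_order) are made up for illustration.

```c
#include <stdbool.h>

/*
 * Toy model of a lock class as printed by lockdep. Hypothetical names;
 * this is not the kernel's struct lock_class.
 */
struct lock_model {
	const char *name;
	/* Usage string from the braces: hardirq, hardirq-read,
	 * softirq, softirq-read. */
	const char *usage;
};

/* Position of the softirq (write) character in the usage string. */
enum { USAGE_SOFTIRQ = 2 };

/* "Safe": the lock has been taken from softirq context ('-' or '?'). */
static bool softirq_safe(const struct lock_model *l)
{
	char c = l->usage[USAGE_SOFTIRQ];

	return c == '-' || c == '?';
}

/* "Unsafe": the lock has been taken with softirqs enabled ('+' or '?'). */
static bool softirq_unsafe(const struct lock_model *l)
{
	char c = l->usage[USAGE_SOFTIRQ];

	return c == '+' || c == '?';
}

/*
 * The pattern the warning is about: taking a softirq-unsafe lock
 * while already holding a softirq-safe one.
 */
static bool bad_softirq_order(const struct lock_model *held,
			      const struct lock_model *next)
{
	return softirq_safe(held) && softirq_unsafe(next);
}

/* The two classes from the report, with their usage strings as printed. */
static const struct lock_model ctx_lock = { "&(&ctx->ctx_lock)->rlock", "..-." };
static const struct lock_model fd_wqh = { "&ctx->fd_wqh", "+.+." };
```

With these two classes, bad_softirq_order(&ctx_lock, &fd_wqh) is true,
which matches the warning. It does not say on which side the ordering
should be fixed, which is what I am asking about.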

Thanks,
Miklos