Re: possible deadlock in shmem_mfill_atomic_pte

From: Yang Shi
Date: Wed Apr 15 2020 - 22:23:23 EST


On Wed, Apr 15, 2020 at 6:27 PM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>
> On Mon, 13 Apr 2020, Yang Shi wrote:
> > On Tue, Mar 31, 2020 at 10:21 AM syzbot
> > <syzbot+e27980339d305f2dbfd9@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit: 527630fb Merge tag 'clk-fixes-for-linus' of git://git.kern..
> > > git tree: upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1214875be00000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=27392dd2975fd692
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=e27980339d305f2dbfd9
> > > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+e27980339d305f2dbfd9@xxxxxxxxxxxxxxxxxxxxxxxxx
> > >
> > > WARNING: possible irq lock inversion dependency detected
> > > 5.6.0-rc7-syzkaller #0 Not tainted
> > > --------------------------------------------------------
> > > syz-executor.0/10317 just changed the state of lock:
> > > ffff888021d16568 (&(&info->lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
> > > ffff888021d16568 (&(&info->lock)->rlock){+.+.}, at: shmem_mfill_atomic_pte+0x1012/0x21c0 mm/shmem.c:2407
> > > but this lock was taken by another, SOFTIRQ-safe lock in the past:
> > > (&(&xa->xa_lock)->rlock#5){..-.}
> > >
> > >
> > > and interrupts could create inverse lock ordering between them.
> > >
> > >
> > > other info that might help us debug this:
> > > Possible interrupt unsafe locking scenario:
> > >
> > >        CPU0                    CPU1
> > >        ----                    ----
> > >   lock(&(&info->lock)->rlock);
> > >                                local_irq_disable();
> > >                                lock(&(&xa->xa_lock)->rlock#5);
> > >                                lock(&(&info->lock)->rlock);
> > >   <Interrupt>
> > >     lock(&(&xa->xa_lock)->rlock#5);
> > >
> > > *** DEADLOCK ***
> >
> > This looks possible. shmem_mfill_atomic_pte() acquires info->lock with
> > irq enabled.
> >
> > The below patch should be able to fix it:
>
> I agree, thank you: please send to akpm with your signoff and
>
> Reported-by: syzbot+e27980339d305f2dbfd9@xxxxxxxxxxxxxxxxxxxxxxxxx
> Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
> Acked-by: Hugh Dickins <hughd@xxxxxxxxxx>
>
> I bet that 4.11 commit was being worked on before 4.8 reversed the
> ordering of info->lock and tree_lock, changing spin_lock(&info->lock)s
> to spin_lock_irq*(&info->lock)s - this one is the only hold-out; and
> not using userfaultfd, I wouldn't have seen the lockdep report.

Thanks, Hugh. I believe this patch should fix the splat. I'm pushing my
test tree to GitHub so that syzkaller can test it, and I will send the
formal patch once it has been tested. The push is just slow, at less
than 50KB/s...


>
> >
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index d722eb8..762da6a 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -2399,11 +2399,11 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
> >
> >  	lru_cache_add_anon(page);
> >
> > -	spin_lock(&info->lock);
> > +	spin_lock_irq(&info->lock);
> >  	info->alloced++;
> >  	inode->i_blocks += BLOCKS_PER_PAGE;
> >  	shmem_recalc_inode(inode);
> > -	spin_unlock(&info->lock);
> > +	spin_unlock_irq(&info->lock);
> >
> >  	inc_mm_counter(dst_mm, mm_counter_file(page));
> >  	page_add_file_rmap(page, false);
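
To spell out why the _irq variant closes the window lockdep complained
about, here is a minimal userspace sketch. It is not kernel code: struct
model_cpu and the model_* helpers are hypothetical names made up for
illustration. They only track whether interrupts are still enabled while
info->lock is held, which is the CPU0 condition in the reported scenario.

```c
/* Userspace model only; none of these helpers are real kernel APIs. */
#include <assert.h>
#include <stdbool.h>

struct model_cpu {
	bool irqs_enabled;	/* can an interrupt/softirq run on this CPU? */
	bool holds_info_lock;	/* does this CPU currently hold info->lock? */
};

/* Models spin_lock(): takes the lock and leaves interrupts untouched. */
static inline void model_spin_lock(struct model_cpu *cpu)
{
	cpu->holds_info_lock = true;
}

static inline void model_spin_unlock(struct model_cpu *cpu)
{
	cpu->holds_info_lock = false;
}

/* Models spin_lock_irq(): disables interrupts first, then takes the lock. */
static inline void model_spin_lock_irq(struct model_cpu *cpu)
{
	cpu->irqs_enabled = false;
	cpu->holds_info_lock = true;
}

/* Models spin_unlock_irq(): drops the lock, then re-enables interrupts. */
static inline void model_spin_unlock_irq(struct model_cpu *cpu)
{
	cpu->holds_info_lock = false;
	cpu->irqs_enabled = true;
}

/*
 * True when an interrupt could arrive while info->lock is held. That is
 * the point where the interrupt path may try to take xa_lock while another
 * CPU holds xa_lock and spins on info->lock.
 */
static inline bool model_inversion_window(const struct model_cpu *cpu)
{
	return cpu->holds_info_lock && cpu->irqs_enabled;
}
```

With a CPU that starts with interrupts enabled, model_inversion_window()
is true after model_spin_lock() and false after model_spin_lock_irq().
The real change relies on the same property: interrupts stay off for the
whole time info->lock is held.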