Re: possible deadlock in sidtab_sid2str_put
From: Ondrej Mosnacek
Date: Tue Jan 28 2020 - 08:51:33 EST
On Tue, Jan 28, 2020 at 2:39 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> On Tue, Jan 28, 2020 at 7:50 AM syzbot
> <syzbot+61cba5033e2072d61806@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > syzbot has found a reproducer for the following crash on:
> >
> > HEAD commit: b0be0eff Merge tag 'x86-pti-2020-01-28' of git://git.kerne..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1432aebee00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=9784e57c96a92f20
> > dashboard link: https://syzkaller.appspot.com/bug?extid=61cba5033e2072d61806
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10088e95e00000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13fa605ee00000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+61cba5033e2072d61806@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > =====================================================
> > WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
> > 5.5.0-syzkaller #0 Not tainted
> > -----------------------------------------------------
> > syz-executor305/10624 [HC0[0]:SC0[2]:HE1:SE0] is trying to acquire:
> > ffff888098c14098 (&(&s->cache_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
> > ffff888098c14098 (&(&s->cache_lock)->rlock){+.+.}, at: sidtab_sid2str_put.part.0+0x36/0x880 security/selinux/ss/sidtab.c:533
> >
> > and this task is already holding:
> > ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
> > ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91
> > which would create a new lock dependency:
> > (&(&nf_conntrack_locks[i])->rlock){+.-.} -> (&(&s->cache_lock)->rlock){+.+.}
> >
> > but this new dependency connects a SOFTIRQ-irq-safe lock:
> > (&(&nf_conntrack_locks[i])->rlock){+.-.}
> >
> > ... which became SOFTIRQ-irq-safe at:
> > lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484
> > __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> > _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
> > spin_lock include/linux/spinlock.h:338 [inline]
> > nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91
>
> ...
>
> > to a SOFTIRQ-irq-unsafe lock:
> > (&(&s->cache_lock)->rlock){+.+.}
> >
> > ... which became SOFTIRQ-irq-unsafe at:
> > ...
> > lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484
> > __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> > _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
> > spin_lock include/linux/spinlock.h:338 [inline]
> > sidtab_sid2str_put.part.0+0x36/0x880 security/selinux/ss/sidtab.c:533
> > sidtab_sid2str_put+0xa0/0xc0 security/selinux/ss/sidtab.c:566
> > sidtab_entry_to_string security/selinux/ss/services.c:1279 [inline]
> > sidtab_entry_to_string+0xf2/0x110 security/selinux/ss/services.c:1266
> > security_sid_to_context_core+0x2c6/0x3c0 security/selinux/ss/services.c:1361
> > security_sid_to_context+0x34/0x40 security/selinux/ss/services.c:1384
> > avc_audit_post_callback+0x102/0x790 security/selinux/avc.c:709
> > common_lsm_audit+0x5ac/0x1e00 security/lsm_audit.c:466
> > slow_avc_audit+0x16a/0x1f0 security/selinux/avc.c:782
> > avc_audit security/selinux/include/avc.h:140 [inline]
> > avc_has_perm+0x543/0x610 security/selinux/avc.c:1185
> > inode_has_perm+0x1a8/0x230 security/selinux/hooks.c:1631
> > selinux_mmap_file+0x10a/0x1d0 security/selinux/hooks.c:3701
> > security_mmap_file+0xa4/0x1e0 security/security.c:1482
> > vm_mmap_pgoff+0xf0/0x230 mm/util.c:502
>
> ...
>
> > other info that might help us debug this:
> >
> > Possible interrupt unsafe locking scenario:
> >
> > CPU0 CPU1
> > ---- ----
> > lock(&(&s->cache_lock)->rlock);
> > local_irq_disable();
> > lock(&(&nf_conntrack_locks[i])->rlock);
> > lock(&(&s->cache_lock)->rlock);
> > <Interrupt>
> > lock(&(&nf_conntrack_locks[i])->rlock);
> >
> > *** DEADLOCK ***
> >
> > 4 locks held by syz-executor305/10624:
> > #0: ffffffff8c1acc68 (&table[i].mutex){+.+.}, at: nfnl_lock net/netfilter/nfnetlink.c:62 [inline]
> > #0: ffffffff8c1acc68 (&table[i].mutex){+.+.}, at: nfnetlink_rcv_msg+0x9ee/0xfb0 net/netfilter/nfnetlink.c:224
> > #1: ffff8880836415d8 (nlk_cb_mutex-NETFILTER){+.+.}, at: netlink_dump+0xe7/0xfb0 net/netlink/af_netlink.c:2199
> > #2: ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
> > #2: ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91
> > #3: ffffffff8b7df008 (&selinux_ss.policy_rwlock){.+.?}, at: security_sid_to_context_core+0x1ca/0x3c0 security/selinux/ss/services.c:1344
>
> I think this is going to be tricky to fix due to the differing
> contexts from which sidtab_sid2str_put() may be called. We already
> have a check for !in_task() in sidtab_sid2str_put(), do we want to add
> a check for !in_serving_softirq() too?
I don't think that is going to fix it. The issue here is that an
interrupt may happen while we are holding the lock and in that
interrupt we try to take some other lock, which another thread has
taken before the cache lock... I'm afraid using the irqsave/irqrestore
locking is the only possible fix here... (Either that or giving up on
the proper least-recently-used logic.)
--
Ondrej Mosnacek <omosnace at redhat dot com>
Software Engineer, Security Technologies
Red Hat, Inc.