RE: general protection fault in fib6_purge_rt

From: Jon Maloy
Date: Thu Mar 21 2019 - 04:53:58 EST




> -----Original Message-----
> From: Xin Long <lucien.xin@xxxxxxxxx>
> Sent: 20-Mar-19 20:09
> To: Jon Maloy <jon.maloy@xxxxxxxxxxxx>
> Cc: Dmitry Vyukov <dvyukov@xxxxxxxxxx>; syzbot
> <syzbot+a25307ad099309f1c2b9@xxxxxxxxxxxxxxxxxxxxxxxxx>;
> davem@xxxxxxxxxxxxx; kuznet@xxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx;
> syzkaller-bugs@xxxxxxxxxxxxxxxx; tipc-discussion@xxxxxxxxxxxxxxxxxxxxx;
> ying.xue@xxxxxxxxxxxxx; yoshfuji@xxxxxxxxxxxxxx
> Subject: Re: general protection fault in fib6_purge_rt
>
> On Thu, Mar 21, 2019 at 12:54 AM Jon Maloy <jon.maloy@xxxxxxxxxxxx>
> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> > > Sent: 20-Mar-19 17:41
> > > To: Jon Maloy <jon.maloy@xxxxxxxxxxxx>
> > > > Cc: syzbot <syzbot+a25307ad099309f1c2b9@xxxxxxxxxxxxxxxxxxxxxxxxx>;
> > > > davem@xxxxxxxxxxxxx; kuznet@xxxxxxxxxxxxx;
> > > > linux-kernel@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx;
> > > > syzkaller-bugs@xxxxxxxxxxxxxxxx; tipc-discussion@xxxxxxxxxxxxxxxxxxxxx;
> > > > ying.xue@xxxxxxxxxxxxx; yoshfuji@xxxxxxxxxxxxxx
> > > Subject: Re: general protection fault in fib6_purge_rt
> > >
> > > On Wed, Mar 20, 2019 at 4:59 PM Jon Maloy <jon.maloy@xxxxxxxxxxxx>
> > > wrote:
> > > >
> > > > > This one identifies the same culprit as
> > > > > syzbot+9d4c12bfd45a58738d0a@xxxxxxxxxxxxxxxxxxxxxxxxx, but points to a
> > > > > different bug.
> > > > > That bug has also been fixed, in commit adba75be0d23 ("tipc: fix
> > > > > lockdep warning when reinitilaizing sockets"), applied in 4.20 but not
> > > > > present in 4.16, the source of the dump.
> > > > > Once again, a dump from 4.20/5.0 might be a help.
> Hi, Jon,
>
> I was running the reproducer against the net.git kernel which includes
> commit adba75be0d23.
>
> Another panic showed up:
>
> [ 156.086487] ==================================================================
> [ 156.088228] BUG: KASAN: use-after-free in tipc_disc_timeout+0x9c9/0xb20 [tipc]
> [ 156.089740] Read of size 8 at addr ffff88802fdb1be8 by task swapper/1/0
> [ 156.091120]
> [ 156.091471] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.0.0.test.syz #257
> [ 156.092873] Hardware name: Red Hat KVM, BIOS seabios-1.7.5-8.el7 04/01/2014
> [ 156.094315] Call Trace:
> [ 156.094844] <IRQ>
> [ 156.095306] dump_stack+0x7c/0xc0
> [ 156.096040] ? tipc_disc_timeout+0x9c9/0xb20 [tipc]
> [ 156.097346] print_address_description+0x65/0x22e
> [ 156.098360] ? tipc_disc_timeout+0x9c9/0xb20 [tipc]
> [ 156.099408] ? tipc_disc_timeout+0x9c9/0xb20 [tipc]
> [ 156.100445] kasan_report.cold.3+0x37/0x7a
> [ 156.101348] ? tipc_disc_timeout+0x9c9/0xb20 [tipc]
> [ 156.102402] tipc_disc_timeout+0x9c9/0xb20 [tipc]
> [ 156.103641] ? tipc_disc_msg_xmit.isra.19+0x180/0x180 [tipc]
> [ 156.104830] ? __lock_is_held+0xb4/0x140
> [ 156.105669] ? call_timer_fn+0xd1/0x610
> [ 156.106517] call_timer_fn+0x19a/0x610
> [ 156.107342] ? tipc_disc_msg_xmit.isra.19+0x180/0x180 [tipc]
> [ 156.108538] ? timer_fixup_init+0x30/0x30
> [ 156.109411] ? _raw_spin_unlock_irq+0x29/0x40
> [ 156.110343] ? tipc_disc_msg_xmit.isra.19+0x180/0x180 [tipc]
> [ 156.111545] ? tipc_disc_msg_xmit.isra.19+0x180/0x180 [tipc]
> [ 156.112749] run_timer_softirq+0xb51/0x1090
> [ 156.113656] ? add_timer+0x8d0/0x8d0
> [ 156.114433] ? kvm_sched_clock_read+0x14/0x30
> [ 156.115355] ? sched_clock+0x5/0x10
> [ 156.116124] __do_softirq+0x236/0xa1c
> [ 156.116943] irq_exit+0x281/0x2d0
> [ 156.117657] smp_apic_timer_interrupt+0x172/0x5d0
> [ 156.118658] apic_timer_interrupt+0xf/0x20
>
>
> I think it's caused by d->timer not having been deleted when the netns was
> destroyed, so tipc_disc_timeout() still used d->net after it had been freed.
>
> I looked at the __net_exit path, it should have been done by:
> tipc_exit_net() ->
> tipc_net_stop()->
> tipc_bearer_stop()->
> bearer_disable()->
> tipc_disc_delete()->
> del_timer_sync(&d->timer)
>
> but because of if (!self), it returned in tipc_net_stop().
>
> It seems to me that whether to do tipc_bearer/node_stop() for netns or not
> shouldn't depend on tipc_net(net)->node_addr.
> Can we just remove that if(!self) from tipc_net_stop() to fix it?
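
The failure mode described above can be modelled in user space. This is only a sketch, not kernel code: struct model_net, model_bearer_stop(), model_net_stop_old() and model_net_stop_unconditional() are hypothetical stand-ins, and a boolean stands in for the pending d->timer. It shows why the early return leaves the timer armed while the node address is still zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the per-netns TIPC state. */
struct model_net {
	uint32_t node_addr;    /* 0 until the node address has been set */
	bool disc_timer_armed; /* stands in for a pending d->timer */
};

/* Stands in for the tipc_bearer_stop() ... del_timer_sync() chain. */
static void model_bearer_stop(struct model_net *net)
{
	net->disc_timer_armed = false;
}

/* Mirrors the existing guard: nothing is torn down while node_addr == 0. */
static void model_net_stop_old(struct model_net *net)
{
	assert(net != NULL);
	if (!net->node_addr)
		return;
	model_bearer_stop(net);
}

/* Mirrors the proposed change: always run the teardown chain. */
static void model_net_stop_unconditional(struct model_net *net)
{
	assert(net != NULL);
	model_bearer_stop(net);
}
```

With node_addr == 0 and the timer armed, model_net_stop_old() leaves disc_timer_armed set, which corresponds to the timer firing after the netns is gone; the unconditional variant clears it.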

That would probably work. Before the problematic commit, (!self) just meant that we had never entered
network mode, and that there was nothing to stop or delete. That changed when this patch introduced
the address negotiation period. So, if somebody leaves network mode before the hash address has been set, we return early without stopping anything, and this will happen.

My concern is that we might run into surprises when we continue into the later functions, such as tipc_bearer_stop(), so I would prefer to avoid that.
The safer approach would be to test for (!tipc_own_id(net)) instead, which now serves as a safe indicator of whether we have entered network mode or not.
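
To make the difference between the two guards concrete, here is a small user-space sketch. It is not the kernel implementation: struct model_ident and both helpers are hypothetical, and it assumes, as described above, that the node identity is already set during the address negotiation period while the node address is still zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the two identity fields discussed above. */
struct model_ident {
	bool id_set;   /* what a tipc_own_id()-style check would report */
	uint32_t addr; /* what tipc_own_addr() would report; 0 = not set yet */
};

/* Existing guard: tear down only if the node address is non-zero. */
static bool stop_runs_addr_guard(const struct model_ident *s)
{
	assert(s != NULL);
	return s->addr != 0;
}

/* Suggested guard: tear down whenever the node identity is set. */
static bool stop_runs_id_guard(const struct model_ident *s)
{
	assert(s != NULL);
	return s->id_set;
}
```

The two guards agree when network mode was never entered (both skip the teardown) and when the address is set (both run it). They differ only in the negotiation window, where the id-based guard still runs the teardown.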

> It also seems that tipc_nametbl_stop() will do the cleanup for nametbl, so should
> tipc_nametbl_withdraw() also be removed from tipc_net_stop()?

Yes. This looks like legacy from the previous implementation.

///jon

>
> diff --git a/net/tipc/net.c b/net/tipc/net.c
> index f076edb..3647984 100644
> --- a/net/tipc/net.c
> +++ b/net/tipc/net.c
> @@ -163,12 +163,6 @@ void tipc_sched_net_finalize(struct net *net, u32 addr)
>
>  void tipc_net_stop(struct net *net)
>  {
> -	u32 self = tipc_own_addr(net);
> -
> -	if (!self)
> -		return;
> -
> -	tipc_nametbl_withdraw(net, TIPC_CFG_SRV, self, self, self);
>  	rtnl_lock();
>  	tipc_bearer_stop(net);
>  	tipc_node_stop(net);
>
> > >
> > >
> > > Looking at the bisection log maybe this reproducer triggers multiple
> > > kernel bugs.
> >
> > I think so.
> >
> > > All crashes including the latest ones and other info are always
> > > available on the dashboard.
> >
> > Looking at the latest dashboard reports, I don't see anything that points to
> > TIPC.
> >
> > ///jon
> >
> >
> > >
> > >
> > > > ///jon
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > > From: syzbot
> > > > > > <syzbot+a25307ad099309f1c2b9@xxxxxxxxxxxxxxxxxxxxxxxxx>
> > > > > > Sent: 18-Mar-19 08:28
> > > > > > To: davem@xxxxxxxxxxxxx; Jon Maloy <jon.maloy@xxxxxxxxxxxx>;
> > > > > > kuznet@xxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > > > > netdev@xxxxxxxxxxxxxxx; syzkaller-bugs@xxxxxxxxxxxxxxxx;
> > > > > > tipc-discussion@xxxxxxxxxxxxxxxxxxxxx; ying.xue@xxxxxxxxxxxxx;
> > > > > > yoshfuji@linux-ipv6.org
> > > > > Subject: Re: general protection fault in fib6_purge_rt
> > > > >
> > > > > syzbot has bisected this bug to:
> > > > >
> > > > > commit 52dfae5c85a4c1078e9f1d5e8947d4a25f73dd81
> > > > > Author: Jon Maloy <jon.maloy@xxxxxxxxxxxx>
> > > > > Date: Thu Mar 22 19:42:52 2018 +0000
> > > > >
> > > > > tipc: obtain node identity from interface by default
> > > > >
> > > > > > bisection log:
> > > > > > https://syzkaller.appspot.com/x/bisect.txt?x=1116d2a3200000
> > > > > > start commit: 52dfae5c tipc: obtain node identity from interface by defa..
> > > > > > git tree: linux-next
> > > > > > final crash:
> > > > > > https://syzkaller.appspot.com/x/report.txt?x=1316d2a3200000
> > > > > > console output:
> > > > > > https://syzkaller.appspot.com/x/log.txt?x=1516d2a3200000
> > > > > > kernel config:
> > > > > > https://syzkaller.appspot.com/x/.config?x=c8b6073d992e8217
> > > > > > dashboard link:
> > > > > > https://syzkaller.appspot.com/bug?extid=a25307ad099309f1c2b9
> > > > > > syz repro:
> > > > > > https://syzkaller.appspot.com/x/repro.syz?x=16b2c56f200000
> > > > > > C reproducer:
> > > > > > https://syzkaller.appspot.com/x/repro.c?x=13b8890b200000
> > > > >
> > > > > Reported-by:
> > > > > syzbot+a25307ad099309f1c2b9@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > > > Fixes: 52dfae5c ("tipc: obtain node identity from interface by
> > > > > default")
> > > >
> > > > --
> > > > You received this message because you are subscribed to the Google
> > > Groups "syzkaller-bugs" group.
> > > > To unsubscribe from this group and stop receiving emails from it,
> > > > send an
> > > email to syzkaller-bugs+unsubscribe@xxxxxxxxxxxxxxxxx
> > > > To view this discussion on the web visit
> > > https://groups.google.com/d/msgid/syzkaller-
> > >
> bugs/BL0PR1501MB20039998B662DCC11E2B38D79A410%40BL0PR1501MB200
> > > 3.namprd15.prod.outlook.com.
> > > > For more options, visit https://groups.google.com/d/optout.