Re: net/sctp: use-after-free in sctp_association_put

From: Xin Long
Date: Fri Mar 03 2017 - 00:38:45 EST


On Fri, Mar 3, 2017 at 3:21 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> On Thu, Mar 2, 2017 at 9:06 AM, Xin Long <lucien.xin@xxxxxxxxx> wrote:
>> On Thu, Mar 2, 2017 at 3:18 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>>> Hello,
>>>
>>> I've got the following report while running the syzkaller fuzzer on
>>> linux-next/8813198236a044b76e251dcae937b180dd527999:
>>>
>>> BUG: KASAN: use-after-free in sctp_association_destroy
>>> net/sctp/associola.c:416 [inline] at addr ffff8801c0fa415c
>>> BUG: KASAN: use-after-free in sctp_association_put+0x294/0x300
>>> net/sctp/associola.c:881 at addr ffff8801c0fa415c
>>> Read of size 1 by task syz-executor1/10956
>>> CPU: 1 PID: 10956 Comm: syz-executor1 Not tainted 4.10.0-rc7-next-20170213 #1
>>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>>> BIOS Google 01/01/2011
>>> Call Trace:
>>> <IRQ>
>>> __dump_stack lib/dump_stack.c:15 [inline]
>>> dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>>> kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
>>> print_address_description mm/kasan/report.c:200 [inline]
>>> kasan_report_error mm/kasan/report.c:289 [inline]
>>> kasan_report.part.2+0x1e5/0x4b0 mm/kasan/report.c:311
>>> kasan_report mm/kasan/report.c:329 [inline]
>>> __asan_report_load1_noabort+0x29/0x30 mm/kasan/report.c:329
>>> sctp_association_destroy net/sctp/associola.c:416 [inline]
>>> sctp_association_put+0x294/0x300 net/sctp/associola.c:881
>>> sctp_generate_timeout_event+0x115/0x360 net/sctp/sm_sideeffect.c:317
>>> sctp_generate_t1_init_event+0x1a/0x20 net/sctp/sm_sideeffect.c:329
>>> call_timer_fn+0x241/0x820 kernel/time/timer.c:1308
>>> expire_timers kernel/time/timer.c:1348 [inline]
>>> __run_timers+0x9e7/0xe90 kernel/time/timer.c:1642
>>> run_timer_softirq+0x21/0x80 kernel/time/timer.c:1655
>>> __do_softirq+0x31f/0xbe7 kernel/softirq.c:284
>>> invoke_softirq kernel/softirq.c:364 [inline]
>>> irq_exit+0x1cc/0x200 kernel/softirq.c:405
>>> exiting_irq arch/x86/include/asm/apic.h:658 [inline]
>>> smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
>>> apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:707
>>> RIP: 0010:arch_local_irq_enable arch/x86/include/asm/paravirt.h:788 [inline]
>>> RIP: 0010:__raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline]
>>> RIP: 0010:_raw_spin_unlock_irq+0x56/0x70 kernel/locking/spinlock.c:199
>>> RSP: 0018:ffff8801c280f178 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
>>> RAX: dffffc0000000000 RBX: ffff8801dbf24a00 RCX: 0000000000000006
>>> RDX: 1ffffffff0a18d03 RSI: ffff8801d71c88e0 RDI: ffffffff850c6818
>>> RBP: ffff8801c280f180 R08: 0000000000000002 R09: 0000000000000000
>>> R10: 0000000000000006 R11: 0000000000000000 R12: ffff8801c0f3a4c0
>>> R13: 1ffff10038501e38 R14: ffff8801d71c80c0 R15: ffff8801d71c80c0
>>> </IRQ>
>>> finish_lock_switch kernel/sched/sched.h:1248 [inline]
>>> finish_task_switch+0x1c2/0x720 kernel/sched/core.c:2792
>>> context_switch kernel/sched/core.c:2928 [inline]
>>> __schedule+0x893/0x2290 kernel/sched/core.c:3468
>>> preempt_schedule_common+0x35/0x60 kernel/sched/core.c:3579
>>> _cond_resched+0x17/0x20 kernel/sched/core.c:4977
>>> slab_pre_alloc_hook mm/slab.h:427 [inline]
>>> slab_alloc mm/slab.c:3390 [inline]
>>> __do_kmalloc mm/slab.c:3730 [inline]
>>> __kmalloc_track_caller+0x26a/0x690 mm/slab.c:3747
>>> kstrdup+0x39/0x70 mm/util.c:54
>>> snd_timer_instance_new+0xfc/0x5d0 sound/core/timer.c:110
>>> snd_timer_open+0x878/0x1740 sound/core/timer.c:290
>>> snd_timer_user_tselect sound/core/timer.c:1621 [inline]
>>> __snd_timer_user_ioctl sound/core/timer.c:1901 [inline]
>>> snd_timer_user_ioctl+0x9b1/0x34a0 sound/core/timer.c:1931
>>> vfs_ioctl fs/ioctl.c:43 [inline]
>>> do_vfs_ioctl+0x1bf/0x1790 fs/ioctl.c:683
>>> SYSC_ioctl fs/ioctl.c:698 [inline]
>>> SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
>>> entry_SYSCALL_64_fastpath+0x1f/0xc2
>>> RIP: 0033:0x44fb59
>>> RSP: 002b:00007f0dc184db58 EFLAGS: 00000212 ORIG_RAX: 0000000000000010
>>> RAX: ffffffffffffffda RBX: 0000000040345410 RCX: 000000000044fb59
>>> RDX: 0000000020001000 RSI: 0000000040345410 RDI: 0000000000000005
>>> RBP: 0000000000000005 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000212 R12: 0000000000708000
>>> R13: 0000000000a5fc57 R14: 00007f0dc184e9c0 R15: 0000000000000000
>>> Object at ffff8801c0fa4140, in cache kmalloc-4096 size: 4096
>>> Allocated:
>>> PID = 10965
>>> save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>> save_stack+0x43/0xd0 mm/kasan/kasan.c:504
>>> set_track mm/kasan/kasan.c:516 [inline]
>>> kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:607
>>> kmem_cache_alloc_trace+0x10b/0x670 mm/slab.c:3634
>>> kmalloc include/linux/slab.h:490 [inline]
>>> kzalloc include/linux/slab.h:663 [inline]
>>> sctp_association_new+0x114/0x2120 net/sctp/associola.c:306
>>> sctp_sendmsg+0x1585/0x38f0 net/sctp/socket.c:1835
>>> inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
>>> sock_sendmsg_nosec net/socket.c:633 [inline]
>>> sock_sendmsg+0xca/0x110 net/socket.c:643
>>> ___sys_sendmsg+0x8fa/0x9f0 net/socket.c:1985
>>> __sys_sendmsg+0x138/0x300 net/socket.c:2019
>>> SYSC_sendmsg net/socket.c:2030 [inline]
>>> SyS_sendmsg+0x2d/0x50 net/socket.c:2026
>>> entry_SYSCALL_64_fastpath+0x1f/0xc2
>>> Freed:
>>> PID = 10965
>>> save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>> save_stack+0x43/0xd0 mm/kasan/kasan.c:504
>>> set_track mm/kasan/kasan.c:516 [inline]
>>> kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:580
>>> __cache_free mm/slab.c:3510 [inline]
>>> kfree+0xd3/0x250 mm/slab.c:3827
>>> sctp_association_destroy net/sctp/associola.c:432 [inline]
>>> sctp_association_put+0x20e/0x300 net/sctp/associola.c:881
>>> sctp_association_free+0x635/0x8d0 net/sctp/associola.c:410
>>> sctp_cmd_delete_tcb net/sctp/sm_sideeffect.c:891 [inline]
>>> sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1306 [inline]
>>> sctp_side_effects net/sctp/sm_sideeffect.c:1171 [inline]
>>> sctp_do_sm+0x28a2/0x6900 net/sctp/sm_sideeffect.c:1143
>>> sctp_primitive_SHUTDOWN+0xa9/0xd0 net/sctp/primitive.c:104
>>> sctp_close+0x3c3/0x9d0 net/sctp/socket.c:1530
>>> inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
>>> inet6_release+0x50/0x70 net/ipv6/af_inet6.c:432
>>> sock_release+0x8d/0x1e0 net/socket.c:597
>>> sock_close+0x16/0x20 net/socket.c:1061
>>> __fput+0x332/0x7f0 fs/file_table.c:208
>>> ____fput+0x15/0x20 fs/file_table.c:244
>>> task_work_run+0x18a/0x260 kernel/task_work.c:116
>>> exit_task_work include/linux/task_work.h:21 [inline]
>>> do_exit+0x1956/0x2900 kernel/exit.c:873
>>> do_group_exit+0x149/0x420 kernel/exit.c:977
>>> get_signal+0x7e0/0x1820 kernel/signal.c:2313
>>> do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:807
>>> exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:156
>>> prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>>> syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
>>> entry_SYSCALL_64_fastpath+0xc0/0xc2
>>> Memory state around the buggy address:
>>> ffff8801c0fa4000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>> ffff8801c0fa4080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>>ffff8801c0fa4100: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
>>> ^
>>> ffff8801c0fa4180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>> ffff8801c0fa4200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>> ==================================================================
>>>
>>>
>>>
>>> Shouldn't sctp_association_free call del_timer_sync instead of del_timer?
>> I think it's safe to use del_timer there, as the timer handler
>> sctp_generate_timeout_event checks asoc->base.dead under
>> sock lock to decide if it will call the event handler.
>>
>> So even if sctp_association_free frees the assoc (without destroying
>> it), a timer handler running on another CPU will not crash the kernel.
>>
>> The issue here looks more like the asoc's refcnt was already <= 1 when
>> the T1 timer handler was running, i.e. something put the asoc
>> incorrectly.
>
> Right.
>
>> Hi Dmitry, do you have a reproducer and .config for this?
>
> No. It happened only once and is not reproducible. Most likely this
> is a race with a very short window of inconsistency.
In linux-next/8813198236a044b76e251dcae937b180dd527999 there is one
race, caused by sctp_association_free being called without holding
the right sock lock:
https://lkml.org/lkml/2017/2/21/688
It would cause a double free of the asoc, as Marcelo said.
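
To make the refcount reasoning from earlier in the thread concrete, here
is a rough userspace sketch. It is not the kernel code: every model_*
name is hypothetical, the sock lock is only indicated by comments, and
it assumes (as I understand the SCTP code) that arming a timer holds a
reference on the asoc and that the handler drops it on exit. It only
shows why one stray put turns the handler's final put into a
use-after-free, while the del_timer/dead-flag scheme alone does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct sctp_association. */
struct model_asoc {
	int refcnt;      /* models asoc->base.refcnt */
	bool dead;       /* models asoc->base.dead */
	bool destroyed;  /* set where the kernel would kfree() the asoc */
	bool uaf;        /* set if a put lands on a destroyed object */
};

static void model_hold(struct model_asoc *a)
{
	a->refcnt++;
}

static void model_put(struct model_asoc *a)
{
	if (a->destroyed) {
		/* In the kernel this access is what KASAN would report. */
		a->uaf = true;
		return;
	}
	if (--a->refcnt == 0)
		a->destroyed = true;
}

/* Models the free path: mark the asoc dead, drop the base reference. */
static void model_free(struct model_asoc *a)
{
	a->dead = true;
	model_put(a);
}

/* Models a timer handler that lost the race with a non-sync del_timer. */
static void model_timer_handler(struct model_asoc *a)
{
	/* the sock lock would be taken here */
	if (!a->dead) {
		/* the state machine event would run here */
	}
	/* the sock lock would be released here */
	model_put(a); /* drop the reference taken when the timer was armed */
}

/* Returns true if the handler's put hit an already destroyed asoc. */
bool run_model(bool extra_put)
{
	struct model_asoc a = { .refcnt = 1 };

	model_hold(&a);           /* arming T1 takes a reference: 2 */
	if (extra_put)
		model_put(&a);    /* the suspected stray put: 1 */
	model_free(&a);           /* close path: dead, one more put */
	model_timer_handler(&a);  /* handler still runs on another CPU */
	return a.uaf;
}
```

With run_model(false) the handler's put is the last one and nothing is
touched after destroy; with run_model(true) the asoc is already
destroyed by the time the handler runs.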

I would expect this commit in net.git to have fixed this issue:

commit dfcb9f4f99f1e9a49e43398a7bfbf56927544af1
Author: Marcelo Ricardo Leitner <marcelo.leitner@xxxxxxxxx>
Date: Thu Feb 23 09:31:18 2017 -0300

sctp: deny peeloff operation on asocs with threads sleeping on it

Thanks.

>
> FWIW right before the crash the thread that allocated the object
> (10965) produced:
>
> [ 122.448837] sctp: [Deprecated]: syz-executor3 (pid 10965) Use of
> int in maxseg socket option.
> [ 122.448837] Use struct sctp_assoc_value instead
> [ 122.468168] sctp: [Deprecated]: syz-executor3 (pid 10965) Use of
> int in max_burst socket option.
> [ 122.468168] Use struct sctp_assoc_value instead