Re: kvm: use-after-free in process_srcu
From: Dmitry Vyukov
Date: Sun Dec 11 2016 - 03:50:30 EST
On Sun, Dec 11, 2016 at 9:40 AM, Vegard Nossum <vegard.nossum@xxxxxxxxx> wrote:
> On 11 December 2016 at 07:46, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>> Hello,
>>
>> I am getting the following use-after-free reports while running
>> syzkaller fuzzer.
>> On commit 318c8932ddec5c1c26a4af0f3c053784841c598e (Dec 7).
>> Unfortunately it is not reproducible, but all reports look sane and
>> very similar, so I would assume that it is some hard-to-trigger race.
>> In all cases the use-after-free offset within struct kvm is 344 bytes.
>> This points to the srcu field, which starts at offset 208 and has size
>> 360 (I have some debug configs enabled).
> [...]
>> [ 376.024345] [<ffffffff81a77f7e>] __fput+0x34e/0x910 fs/file_table.c:208
>> [ 376.024345] [<ffffffff81a785ca>] ____fput+0x1a/0x20 fs/file_table.c:244
>
> I've been hitting what I think is a struct file refcounting bug which
> causes similar symptoms as you have here (the struct file is freed
> while somebody still has an active reference to it).
>
>> [ 376.024345] [<ffffffff81483c20>] task_work_run+0x1a0/0x280
>> kernel/task_work.c:116
>> [ 376.024345] [< inline >] exit_task_work include/linux/task_work.h:21
>> [ 376.024345] [<ffffffff814129e2>] do_exit+0x1842/0x2650 kernel/exit.c:828
>> [ 376.024345] [<ffffffff814139ae>] do_group_exit+0x14e/0x420 kernel/exit.c:932
>> [ 376.024345] [<ffffffff81442b43>] get_signal+0x663/0x1880
>> kernel/signal.c:2307
>> [ 376.024345] [<ffffffff81239b45>] do_signal+0xc5/0x2190
>> arch/x86/kernel/signal.c:807
>
> Was this or any other process by any chance killed by the OOM killer?
> That seems to be a pattern in the crashes I've seen. If not, do you
> know what killed this process?
Difficult to say, as I can't reproduce the crashes.
I've looked at the logs I have and there are no OOM kills, only some
kvm-related messages:
[ 372.188708] kvm [12528]: vcpu0, guest rIP: 0xfff0 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x2, nop
[ 372.321334] kvm [12528]: vcpu0, guest rIP: 0xfff0 unhandled wrmsr: 0x0 data 0x0
[ 372.426831] kvm [12593]: vcpu512, guest rIP: 0xfff0 unhandled wrmsr: 0x5 data 0x200
[ 372.646417] irq bypass consumer (token ffff880052f74780) registration fails: -16
[ 373.001273] pit: kvm: requested 1676 ns i8254 timer period limited to 500000 ns
[ 375.541449] kvm [13011]: vcpu0, guest rIP: 0x110000 unhandled wrmsr: 0x0 data 0x2
[ 376.005387] ==================================================================
[ 376.024345] BUG: KASAN: use-after-free in process_srcu+0x27a/0x280 at addr ffff88005e29a418

[ 720.214985] kvm: vcpu 0: requested 244148 ns lapic timer period limited to 500000 ns
[ 720.271334] kvm: vcpu 0: requested 244148 ns lapic timer period limited to 500000 ns
[ 720.567985] kvm_vm_ioctl_assign_device: host device not found
[ 721.094589] kvm [22114]: vcpu0, guest rIP: 0x2 unhandled wrmsr: 0x6 data 0x8
[ 723.829467] ==================================================================
[ 723.829467] BUG: KASAN: use-after-free in process_srcu+0x27a/0x280 at addr ffff88005a4d10d8
The logs capture only the ~3-4 seconds before each crash.
That said, syzkaller test processes do consume lots of memory from
time to time and cause low-memory conditions.
Kills are usually caused by my test driver, which kills test processes
after a short time.
I also see other assorted kvm bugs that are induced by OOM kills:
https://groups.google.com/d/msg/syzkaller/ytVPh93HLnI/KhZdengZBwAJ
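
For reference, the offset arithmetic from the original report (access at
byte 344 of struct kvm, srcu field starting at 208 with size 360) can be
sketched as below. This is only an illustration: struct field_range and
both helpers are hypothetical names, not kernel APIs, and the numbers
hold only for the debug config mentioned in the report.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor for one field of an enclosing struct. */
struct field_range {
	size_t start;	/* byte offset of the field within the struct */
	size_t size;	/* size of the field in bytes */
};

/* Return true if byte offset 'off' of the struct falls inside the field. */
bool field_contains(struct field_range f, size_t off)
{
	return off >= f.start && off - f.start < f.size;
}

/*
 * Offset of 'off' relative to the start of the field.
 * Only meaningful when field_contains(f, off) is true.
 */
size_t field_relative_offset(struct field_range f, size_t off)
{
	return off - f.start;
}
```

With start = 208 and size = 360 the field covers bytes 208..567, so
offset 344 is inside it, 136 bytes from the start of the field.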