Re: [PATCH] static_key: fix concurrent static_key_slow_inc

From: Paolo Bonzini
Date: Wed Jun 22 2016 - 06:04:52 EST




On 22/06/2016 10:50, Christian Borntraeger wrote:
> On 06/21/2016 06:52 PM, Paolo Bonzini wrote:
>> The following scenario is possible:
>>
>>     CPU 1                                   CPU 2
>>     static_key_slow_inc
>>      atomic_inc_not_zero
>>       -> key.enabled == 0, no increment
>>      jump_label_lock
>>      atomic_inc_return
>>       -> key.enabled == 1 now
>>                                             static_key_slow_inc
>>                                              atomic_inc_not_zero
>>                                               -> key.enabled == 1, inc to 2
>>                                              return
>>                                             ** static key is wrong!
>>      jump_label_update
>>      jump_label_unlock
>>
>> Testing the static key at the point marked by (**) will follow the wrong
>> path for jumps that have not been patched yet. This can actually happen
>> when creating many KVM virtual machines with userspace LAPIC emulation;
>> just run several copies of the following program:
>>
>>     #include <fcntl.h>
>>     #include <unistd.h>
>>     #include <sys/ioctl.h>
>>     #include <linux/kvm.h>
>>
>>     int main(void)
>>     {
>>         for (;;) {
>>             int kvmfd = open("/dev/kvm", O_RDONLY);
>>             int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
>>             close(ioctl(vmfd, KVM_CREATE_VCPU, 1));
>>             close(vmfd);
>>             close(kvmfd);
>>         }
>>         return 0;
>>     }
>>
>> Every KVM_CREATE_VCPU ioctl will attempt a static_key_slow_inc. The
>> static key's purpose is to skip NULL pointer checks and indeed one of
>> the processes eventually dereferences NULL.
>
> Interesting. Some time ago I had a spurious bug on the preempt_notifier
> when starting/stopping lots of guests, but I was never able to reliably
> reproduce it. I was chasing some other bug, so I did not even consider
> static_key to be broken, but this might actually be the fix for that
> problem.

It could be the same bug that was reported here:
http://article.gmane.org/gmane.comp.emulators.kvm.devel/154069
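
For anyone who wants to step through the interleaving quoted above outside
the kernel, here is a rough userspace sketch using C11 atomics. It is not
the kernel code: struct fake_key, inc_not_zero(), replay_race() and the
"patched" flag are hypothetical names made up for illustration, the lock is
elided, and the two CPUs are replayed sequentially in a single thread so the
outcome is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct static_key; not the kernel definition. */
struct fake_key {
    atomic_int enabled;  /* reference count, playing the role of key.enabled */
    bool patched;        /* set once the "jump sites" have been updated */
};

/* Increment *v unless it is zero; returns true if the increment happened. */
static bool inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure, old is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}

/*
 * Replays the quoted interleaving one step at a time.
 * Returns what CPU 2 observes at the point marked (**):
 * true if the jump sites were already patched, false otherwise.
 */
static bool replay_race(struct fake_key *key)
{
    bool cpu1_fast, cpu2_fast, seen_by_cpu2;

    /* CPU 1: fast path fails because enabled == 0. */
    cpu1_fast = inc_not_zero(&key->enabled);
    assert(!cpu1_fast);

    /* CPU 1: slow path (lock elided here), count goes 0 -> 1. */
    atomic_fetch_add(&key->enabled, 1);

    /* CPU 2: fast path succeeds, count goes 1 -> 2, and CPU 2 returns. */
    cpu2_fast = inc_not_zero(&key->enabled);
    assert(cpu2_fast);

    /* (**) CPU 2 now relies on the key, but nothing has been patched yet. */
    seen_by_cpu2 = key->patched;

    /* CPU 1: only now does the equivalent of jump_label_update run. */
    key->patched = true;

    return seen_by_cpu2;
}
```

Calling replay_race() on a zero-initialized struct fake_key returns false:
CPU 2 has come back from its increment with a count of 2 while the jump
sites are still in the disabled state, which is the window marked (**).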

Paolo