Re: [PATCH 4/5] KVM: add __kvm_request_needs_mb
From: Christian Borntraeger
Date: Fri Feb 17 2017 - 05:19:24 EST
On 02/17/2017 11:13 AM, David Hildenbrand wrote:
>>> This is really complicated stuff, and the basic reason for it (if I
>>> remember correctly) is that s390x does reenable all interrupts when
>>> entering the sie (see kvm-s390.c:__vcpu_run()). So the fancy smp-based
>>> kicks don't work (as it is otherwise just racy), and if I remember
>>> correctly, SMP reschedule signals (s390x external calls) would be
>>> slower. (Christian, please correct me if I'm wrong)
>> No the reason was that there are some requests that need to be handled
>> outside run SIE. For example one reason was the guest prefix page.
>> This must be mapped read/write ALL THE TIME when a guest is running,
>> otherwise the host might crash. So we have to exit SIE and make sure that
>> it does not reenter, therefore we use the RELOAD_MMU request from a notifier
>> that is called from page table functions, whenever memory management decides
>> to unmap/write protect (dirty pages tracking, reference tracking, page migration
>> or compaction...)
>> SMP-based request wills kick out the guest, but for some thing like the
>> one above it will be too late.
> While what you said is 100% correct, I had something else in mind that
> hindered using vcpu_kick() and especially kvm_make_all_cpus_request().
> And I remember that being related to how preemption and
> OUTSIDE_GUEST_MODE is handled. I think this boils down to what would
> have to be implemented in kvm_arch_vcpu_should_kick().
> x86 can track the guest state using vcpu->mode, because they can be sure
> that the guest can't reschedule while in the critical guest entry/exit
> section. This is not true for s390x, as preemption is enabled. That's
> why vcpu->mode cannot be used in its current form to track if a VCPU is
> in/oustide/exiting guest mode. And kvm_make_all_cpus_request() currently
> relies on this setting.
> For now, calling vcpu_kick() on s390x will result in a BUG().
Yes, you are right, preemption and interrupt handling certainly made it
hard to switch to the common code function. The prefix page oddity was
the original reason for implementing the s390 special code since
we realized that even mlocking and pinning did not guarantee that
the page table stays writable in the page table.
> On s390x, there are 3 use cases I see for requests:
> 1. Remote requests that need a sync
> Make a request, wait until SIE has been left and make sure the request
> will be processed before re-entering the SIE. e.g. KVM_REQ_RELOAD_MMU
> notifier in mmu notifier you mentioned. Also KVM_REQ_DISABLE_IBS is a
> 2. Remote requests that don't need a sync
> E.g. KVM_REQ_ENABLE_IBS doesn't strictly need it, while
> KVM_REQ_DISABLE_IBS does.
> 3. local requests
> E.g. KVM_REQ_TLB_FLUSH from kvm_s390_set_prefix()
> Of course, having a unified interface would be better.
Yes, we tried to be as common as possible. This enabled us to
use things like halt polling.
Radims idea of a weak function could work out, have not checked yet.
> /* set the request and kick the CPU out of guest mode */
> kvm_set_request(req, vcpu);
> /* set the request, kick the CPU out of guest mode, wait until guest
> mode has been left and make sure the request will be handled before
> reentering guest mode */
> kvm_set_sync_request(req, vcpu);
> Same maybe even for multiple VCPUs (as there are then ways to speed it
> up, e.g. first kick all, then wait for all)
> This would require arch specific callbacks to
> 1. pre announce the request (e.g. set PROG_REQUEST on s390x)
> 2. kick the cpu (e.g. CPUSTAT_STOP_INT and later
> kvm_s390_vsie_kick(vcpu) on s390x)
> 3. check if still executing the guest (e.g. PROG_IN_SIE on s390x)
> This would only make sense if there are other use cases for sync
> requests. At least I remember that Power also has a faster way for
> kicking VCPUs, not involving SMP rescheds. I can't judge if this is a
> s390x only thing and is better be left as is :)
> At least vcpu_kick() could be quite easily made to work on s390x.
> Radim, are there also other users that need something like sync requests?
>>> So this statement, is at least from a s390x point of view wrong. The
>>> kvm_vcpu_kick() function would have to be rerouted to an appropriate
>>> s390x implementation (or that whole smp and OUTSIDE_GUEST_MODE stuff
>>> would have to be factored out).