Re: [Qemu-devel] [PATCH v1 RFC 3/6] KVM: s390: use facilities and cpu_id per KVM
From: Michael Mueller
Date: Mon May 19 2014 - 07:29:48 EST
On Mon, 19 May 2014 12:41:45 +0200
Alexander Graf <agraf@xxxxxxx> wrote:
>
> On 19.05.14 12:13, Michael Mueller wrote:
> > On Fri, 16 May 2014 22:35:34 +0200
> > Alexander Graf <agraf@xxxxxxx> wrote:
> >
> >> On 16.05.14 18:09, Michael Mueller wrote:
> >>> On Fri, 16 May 2014 16:49:37 +0200
> >>> Alexander Graf <agraf@xxxxxxx> wrote:
> >>>
> >>>> On 16.05.14 16:46, Michael Mueller wrote:
> >>>>> On Fri, 16 May 2014 13:55:41 +0200
> >>>>> Alexander Graf <agraf@xxxxxxx> wrote:
> >>>>>
> >>>>>> On 13.05.14 16:58, Michael Mueller wrote:
> >>>>>>> The patch introduces facilities and cpu_ids per virtual machine.
> >>>>>>> Different virtual machines may want to expose different facilities and
> >>>>>>> cpu ids to the guest, so let's make them per-vm instead of global.
> >>>>>>>
> >>>>>>> In addition this patch renames all occurrences of *facilities to *fac_list
> >>>>>>> similar to the already existing symbol stfl_fac_list in lowcore.
> >>>>>>>
> >>>>>>> Signed-off-by: Michael Mueller <mimu@xxxxxxxxxxxxxxxxxx>
> >>>>>>> Acked-by: Cornelia Huck <cornelia.huck@xxxxxxxxxx>
> >>>>>>> Reviewed-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> >>>>>>> ---
> >>>>>>> arch/s390/include/asm/kvm_host.h | 7 +++
> >>>>>>> arch/s390/kvm/gaccess.c | 4 +-
> >>>>>>> arch/s390/kvm/kvm-s390.c | 107 +++++++++++++++++++++++++++------------
> >>>>>>> arch/s390/kvm/kvm-s390.h | 23 +++++++--
> >>>>>>> arch/s390/kvm/priv.c | 13 +++--
> >>>>>>> 5 files changed, 113 insertions(+), 41 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> >>>>>>> index 38d487a..b4751ba 100644
> >>>>>>> --- a/arch/s390/include/asm/kvm_host.h
> >>>>>>> +++ b/arch/s390/include/asm/kvm_host.h
> >>>>>>> @@ -414,6 +414,12 @@ struct kvm_s390_config {
> >>>>>>> struct kvm_s390_attr_name name;
> >>>>>>> };
> >>>>>>>
> >>>>>>> +struct kvm_s390_cpu_model {
> >>>>>>> + unsigned long *sie_fac;
> >>>>>>> + struct cpuid cpu_id;
> >>>>>>> + unsigned long *fac_list;
> >>>>>>> +};
> >>>>>>> +
> >>>>>>> struct kvm_arch{
> >>>>>>> struct sca_block *sca;
> >>>>>>> debug_info_t *dbf;
> >>>>>>> @@ -427,6 +433,7 @@ struct kvm_arch{
> >>>>>>> wait_queue_head_t ipte_wq;
> >>>>>>> struct kvm_s390_config *cfg;
> >>>>>>> spinlock_t start_stop_lock;
> >>>>>>> + struct kvm_s390_cpu_model model;
> >>>>>>> };
> >>>>>>>
> >>>>>>> #define KVM_HVA_ERR_BAD (-1UL)
> >>>>>>> diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
> >>>>>>> index db608c3..4c7ca40 100644
> >>>>>>> --- a/arch/s390/kvm/gaccess.c
> >>>>>>> +++ b/arch/s390/kvm/gaccess.c
> >>>>>>> @@ -358,8 +358,8 @@ static unsigned long guest_translate(struct kvm_vcpu *vcpu, unsigned
> >>>>>>> long gva, union asce asce;
> >>>>>>>
> >>>>>>> ctlreg0.val = vcpu->arch.sie_block->gcr[0];
> >>>>>>> - edat1 = ctlreg0.edat && test_vfacility(8);
> >>>>>>> - edat2 = edat1 && test_vfacility(78);
> >>>>>>> + edat1 = ctlreg0.edat && test_kvm_facility(vcpu->kvm, 8);
> >>>>>>> + edat2 = edat1 && test_kvm_facility(vcpu->kvm, 78);
> >>>>>>> asce.val = get_vcpu_asce(vcpu);
> >>>>>>> if (asce.r)
> >>>>>>> goto real_address;
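As background for the test_kvm_facility() calls above: s390 facility bits are numbered from the most significant bit of the first byte of the facility list onward. A helper equivalent in spirit to the kernel's __test_facility() can be sketched as follows (the function name and buffer handling are illustrative, not the kernel's exact code):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * s390 facility bits are numbered MSB-first: facility 0 is bit 0x80
 * of byte 0, facility 8 is bit 0x80 of byte 1, and so on.
 */
static bool test_facility_bit(unsigned long nr, const uint8_t *fac_list,
                              size_t len_bytes)
{
    if (nr >= len_bytes * 8)
        return false;               /* beyond the stored list */
    return (fac_list[nr >> 3] & (0x80u >> (nr & 7))) != 0;
}
```

With this numbering, the EDAT1 check above (facility 8) tests the top bit of the second byte, and the EDAT2 check (facility 78) tests bit 6 (MSB-first) of byte 9.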
> >>>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> >>>>>>> index 01a5212..a53652f 100644
> >>>>>>> --- a/arch/s390/kvm/kvm-s390.c
> >>>>>>> +++ b/arch/s390/kvm/kvm-s390.c
> >>>>>>> @@ -1,5 +1,5 @@
> >>>>>>> /*
> >>>>>>> - * hosting zSeries kernel virtual machines
> >>>>>>> + * Hosting zSeries kernel virtual machines
> >>>>>>> *
> >>>>>>> * Copyright IBM Corp. 2008, 2009
> >>>>>>> *
> >>>>>>> @@ -30,7 +30,6 @@
> >>>>>>> #include <asm/pgtable.h>
> >>>>>>> #include <asm/nmi.h>
> >>>>>>> #include <asm/switch_to.h>
> >>>>>>> -#include <asm/facility.h>
> >>>>>>> #include <asm/sclp.h>
> >>>>>>> #include<asm/timex.h>
> >>>>>>> #include "kvm-s390.h"
> >>>>>>> @@ -92,15 +91,33 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
> >>>>>>> { NULL }
> >>>>>>> };
> >>>>>>>
> >>>>>>> -unsigned long *vfacilities;
> >>>>>>> -static struct gmap_notifier gmap_notifier;
> >>>>>>> +/* upper facilities limit for kvm */
> >>>>>>> +unsigned long kvm_s390_fac_list_mask[] = {
> >>>>>>> + 0xff82fff3f47c2000UL,
> >>>>>>> + 0x005c000000000000UL,
> >>>>>>> +};
> >>>>>>> +
> >>>>>>> +unsigned long kvm_s390_fac_list_mask_size(void)
> >>>>>>> +{
> >>>>>>> + BUILD_BUG_ON(ARRAY_SIZE(kvm_s390_fac_list_mask) >
> >>>>>>> + S390_ARCH_FAC_MASK_SIZE_U64);
> >>>>>>> + return ARRAY_SIZE(kvm_s390_fac_list_mask);
> >>>>>>> +}
> >>>>>>>
> >>>>>>> -/* test availability of vfacility */
> >>>>>>> -int test_vfacility(unsigned long nr)
> >>>>>>> +void kvm_s390_apply_fac_list_mask(unsigned long fac_list[])
> >>>>>>> {
> >>>>>>> - return __test_facility(nr, (void *) vfacilities);
> >>>>>>> + unsigned int i;
> >>>>>>> +
> >>>>>>> + for (i = 0; i < S390_ARCH_FAC_LIST_SIZE_U64; i++) {
> >>>>>>> + if (i < kvm_s390_fac_list_mask_size())
> >>>>>>> + fac_list[i] &= kvm_s390_fac_list_mask[i];
> >>>>>>> + else
> >>>>>>> + fac_list[i] &= 0UL;
> >>>>>>> + }
> >>>>>>> }
> >>>>>>>
> >>>>>>> +static struct gmap_notifier gmap_notifier;
> >>>>>>> +
> >>>>>>> /* Section: not file related */
> >>>>>>> int kvm_arch_hardware_enable(void *garbage)
> >>>>>>> {
> >>>>>>> @@ -485,6 +502,30 @@ long kvm_arch_vm_ioctl(struct file *filp,
> >>>>>>> return r;
> >>>>>>> }
> >>>>>>>
> >>>>>>> +/* make sure the memory used for fac_list is zeroed */
> >>>>>>> +void kvm_s390_get_hard_fac_list(unsigned long *fac_list, int size)
> >>>>>> Hard? Wouldn't "host" make more sense here?
> >>>>> Renaming "*hard_fac_list" to "*host_fac_list" here and wherever it
> >>>>> appears is ok with me.
> >>>>>
> >>>>>> I also think it makes sense to expose the native host facility list to
> >>>>>> user space via an ioctl somehow.
> >>>>>>
> >>>>> In which situation do you need the full facility list? Do you have an example?
> >>>> If you want to just implement -cpu host to get the exact feature set
> >>>> that the host gives you, how do you know which set that is?
> >>> During qemu machine initialization I call:
> >>>
> >>> kvm_s390_get_machine_props(&mach);
> >>>
> >>> which returns the following information:
> >>>
> >>> typedef struct S390MachineProps {
> >>> uint64_t cpuid;
> >>> uint32_t ibc_range;
> >>> uint8_t pad[4];
> >>> uint64_t fac_mask[S390_ARCH_FAC_MASK_SIZE_UINT64];
> >>> uint64_t hard_fac_list[S390_ARCH_FAC_LIST_SIZE_UINT64];
> >>> uint64_t soft_fac_list[S390_ARCH_FAC_LIST_SIZE_UINT64];
> >>> } S390MachineProps;
> >> Ah, ok, I missed that part. So "kvm_s390_get_machine_props()" basically
> >> gets us the full facility list the host supports. That makes sense.
> >>
> >> I still don't know whether we should care about hard/soft, but I suppose
> >> the rest makes sense.
> > I differentiate between hardware- and software-implemented facilities
> > for several reasons.
> >
> > First, one should understand that both lists represent the same set of
> > features; there is no software feature that is not also implemented as a
> > hardware feature on an existing S390 system. An optional software
> > implementation of a feature is usable on back-level hardware to provide
> > functionality, not quality or performance! That means an OS and its
> > applications are capable of running most efficiently on the given
> > hardware without that software feature. If a system has the same facility
> > implemented in both software and hardware, KVM always prefers the
> > hardware implementation.
> >
> > Second, separating the two allows user space to differentiate between
> > hardware- and software-implemented features and to give up on software
> > features if not explicitly requested by the CPU model option "+sofl".
>
> Why? User space has a set of features it wants to expose to the guest.
You are talking about a different interface here. That is not the intention
of the software/hardware facility split in KVM. These facilities are never
implemented in user space!
> If the kernel can handle all of them, fine. If it can't, it can either
> decide that the VM can't run on that host or disable features and
> *remember it did so* so that on migration it can warn the target host.
>
> > Third, and very importantly, the main reason for the CPU model
> > implementation is to guarantee an identical environment on the source and
> > target side, comprised of facilities, instruction set and CPU id. The
> > capability for migration is
>
> Yes, exactly :).
>
> > lost as soon as software-implemented facilities become activated!
>
> Why? When something is done in software, I can surely migrate the state?
Sure, but in the former case the facility could be implemented in hardware on
the source and in software on the target.
>
> > Once again, here is the method Qemu uses to define the set of requested facilities.
> >
> > The machine properties ioctl provides FMh, HFh and SFh, i.e. the mask and
> > the hardware- and software-implemented facility sets of the host.
> >
> > Qemu has knowledge of all real existing CPU models (TYPE-GA), particularly
> > of the facility sets of these models (Fm).
> >
> > During startup, Qemu retrieves FMh, HFh and SFh. Then, for all CPU models
> > Qemu is aware of, it calculates (Fm & FMh). If that is a subset of HFh,
> > the model is supportable on this host. If the user has specified to allow
> > software-implemented facilities (-cpu <model>,+sofl), the facilities to
> > request are calculated as (Fm & FMh) | SFh.
>
> Why so complicated? What's the difference between -cpu
> z9,cpu-id=z10,+featureA,+featureB,+featureC where A, B and C are the new
> features between z9 and z10?
I absolutely want to avoid a facility space of 2**14 bits being specified on
the command line by an administrator who may be entirely unaware of facilities!
Also, libvirt must not be confronted with this facility flood when making
migration decisions. That is what I think would make things complicated.
I consider a software-defined KVM facility the exception, not the rule.
Michael
>
> There really shouldn't be any difference between those 2 different
> invocation types IMHO.
>
>
> Alex
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/