Re: [RFC] KVM: SVM: do not drop VMCB CPL to 0 if SS is not present
From: Paolo Bonzini
Date: Wed May 31 2017 - 06:50:57 EST
----- Original Message -----
> From: "Roman Penyaev" <roman.penyaev@xxxxxxxxxxxxxxxx>
> To: "Paolo Bonzini" <pbonzini@xxxxxxxxxx>
> Cc: "Mikhail Sennikovskii" <mikhail.sennikovskii@xxxxxxxxxxxxxxxx>, "Gleb Natapov" <gleb@xxxxxxxxxx>,
> kvm@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
> Sent: Wednesday, May 31, 2017 12:17:01 PM
> Subject: Re: [RFC] KVM: SVM: do not drop VMCB CPL to 0 if SS is not present
>
> On Tue, May 30, 2017 at 11:09 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> >
> >
> > On 30/05/2017 19:35, Roman Penyaev wrote:
> >> On Tue, May 30, 2017 at 4:47 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx>
> >> wrote:
> >>>
> >>>
> >>> On 19/05/2017 18:14, Roman Penyaev wrote:
> >>>> 2. A bit complicated, which makes sure the CPL field is preserved across
> >>>> KVM_GET/SET_SREGS calls and makes svm_set_segment() and
> >>>> svm_get_segment()
> >>>> functionality symmethric:
> >>>
> >>> I think I prefer this solution.
> >>>
> >>>> KVM SVM side:
> >>>> -------------
> >>>>
> >>>> --- a/arch/x86/kvm/svm.c
> >>>> +++ b/arch/x86/kvm/svm.c
> >>>> @@ -1999,7 +1999,7 @@ static void svm_set_segment(struct kvm_vcpu
> >>>> *vcpu,
> >>>> * would entail passing the CPL to userspace and back.
> >>>> */
> >>>> if (seg == VCPU_SREG_SS)
> >>>> - svm->vmcb->save.cpl = (s->attrib >>
> >>>> SVM_SELECTOR_DPL_SHIFT) & 3;
> >>>> + svm->vmcb->save.cpl = (var->dpl & 3);
> >>>>
> >>>> mark_dirty(svm->vmcb, VMCB_SEG);
> >>>> }
> >>>
> >>> I wonder why svm_set_segment is setting s->attrib = 0 at all. The
> >>> manual only mentions checking P=0. What about something like:
> >>>
> >>> s->base = var->base;
> >>> s->limit = var->limit;
> >>> s->selector = var->selector;
> >>> s->attrib = (var->type & SVM_SELECTOR_TYPE_MASK);
> >>> s->attrib |= (var->s & 1) << SVM_SELECTOR_S_SHIFT;
> >>> s->attrib |= (var->dpl & 3) << SVM_SELECTOR_DPL_SHIFT;
> >>> s->attrib |= (var->present && !var->unusable) <<
> >>> SVM_SELECTOR_P_SHIFT;
> >>> s->attrib |= (var->avl & 1) << SVM_SELECTOR_AVL_SHIFT;
> >>> s->attrib |= (var->l & 1) << SVM_SELECTOR_L_SHIFT;
> >>> s->attrib |= (var->db & 1) << SVM_SELECTOR_DB_SHIFT;
> >>> s->attrib |= (var->g & 1) << SVM_SELECTOR_G_SHIFT;
> >>
> >> Do we care about compatibility issues? I mean can any old qemu send
> >> us "garbage" in other members of 'var' structure if 'var->unused' == 1 ?
> >
> > That shouldn't matter, the processor shouldn't use them if P=0.
>
> Could you please point me where did you find that? E.g. what I see in
> AMD manual 24593âRev. 3.28âMarch 2017, section "Segment State in the VMCB",
> top of the page 453:
>
> NOTE: For the Stack Segment attributes, P is observed in legacy and
> compatibility mode. In 64-bit mode, P is ignored because all
> stack segments are treated as present.
You're right and in fact the same applies to unusable=1 on Intel. But
on the other hand, if the garbage got there somehow (e.g. via SMM) it's
the right thing to use it.
> True. Fully symmetric. So something like that:
>
> Kernel:
> -------
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index d09bc3e7882c..ecb76d9bf0cb 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -1466,6 +1466,7 @@ static void svm_get_segment(struct kvm_vcpu *vcpu,
> */
> if (var->unusable)
> var->db = 0;
> + /* This is symmetric with svm_set_segment() */
> var->dpl = to_svm(vcpu)->vmcb->save.cpl;
> break;
> }
> @@ -1610,18 +1611,14 @@ static void svm_set_segment(struct kvm_vcpu *vcpu,
> s->base = var->base;
> s->limit = var->limit;
> s->selector = var->selector;
> - if (var->unusable)
> - s->attrib = 0;
> - else {
> - s->attrib = (var->type & SVM_SELECTOR_TYPE_MASK);
> - s->attrib |= (var->s & 1) << SVM_SELECTOR_S_SHIFT;
> - s->attrib |= (var->dpl & 3) << SVM_SELECTOR_DPL_SHIFT;
> - s->attrib |= (var->present & 1) << SVM_SELECTOR_P_SHIFT;
> - s->attrib |= (var->avl & 1) << SVM_SELECTOR_AVL_SHIFT;
> - s->attrib |= (var->l & 1) << SVM_SELECTOR_L_SHIFT;
> - s->attrib |= (var->db & 1) << SVM_SELECTOR_DB_SHIFT;
> - s->attrib |= (var->g & 1) << SVM_SELECTOR_G_SHIFT;
> - }
> + s->attrib = (var->type & SVM_SELECTOR_TYPE_MASK);
> + s->attrib |= (var->s & 1) << SVM_SELECTOR_S_SHIFT;
> + s->attrib |= (var->dpl & 3) << SVM_SELECTOR_DPL_SHIFT;
> + s->attrib |= ((var->present & 1) && !var->unusable) <<
> SVM_SELECTOR_P_SHIFT;
> + s->attrib |= (var->avl & 1) << SVM_SELECTOR_AVL_SHIFT;
> + s->attrib |= (var->l & 1) << SVM_SELECTOR_L_SHIFT;
> + s->attrib |= (var->db & 1) << SVM_SELECTOR_DB_SHIFT;
> + s->attrib |= (var->g & 1) << SVM_SELECTOR_G_SHIFT;
>
> /*
> * This is always accurate, except if SYSRET returned to a segment
> @@ -1630,7 +1627,8 @@ static void svm_set_segment(struct kvm_vcpu *vcpu,
> * would entail passing the CPL to userspace and back.
> */
> if (seg == VCPU_SREG_SS)
> - svm->vmcb->save.cpl = (s->attrib >> SVM_SELECTOR_DPL_SHIFT) &
> 3;
> + /* This is symmetric with svm_get_segment() */
> + svm->vmcb->save.cpl = (var->dpl & 3);
>
> mark_dirty(svm->vmcb, VMCB_SEG);
> }
>
>
> QEMU:
> -----
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 011d4a55b136..faee904d9d59 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -1300,18 +1300,14 @@ static void get_seg(SegmentCache *lhs, const
> struct kvm_segment *rhs)
> lhs->selector = rhs->selector;
> lhs->base = rhs->base;
> lhs->limit = rhs->limit;
> - if (rhs->unusable) {
> - lhs->flags = 0;
> - } else {
> - lhs->flags = (rhs->type << DESC_TYPE_SHIFT) |
> - (rhs->present * DESC_P_MASK) |
> - (rhs->dpl << DESC_DPL_SHIFT) |
> - (rhs->db << DESC_B_SHIFT) |
> - (rhs->s * DESC_S_MASK) |
> - (rhs->l << DESC_L_SHIFT) |
> - (rhs->g * DESC_G_MASK) |
> - (rhs->avl * DESC_AVL_MASK);
> - }
> + lhs->flags = (rhs->type << DESC_TYPE_SHIFT) |
> + ((rhs->present && !rhs->unusable) * DESC_P_MASK) |
> + (rhs->dpl << DESC_DPL_SHIFT) |
> + (rhs->db << DESC_B_SHIFT) |
> + (rhs->s * DESC_S_MASK) |
> + (rhs->l << DESC_L_SHIFT) |
> + (rhs->g * DESC_G_MASK) |
> + (rhs->avl * DESC_AVL_MASK);
> }
Yes, I think both are the right thing to do.
Paolo