Re: [RFC] KVM: SVM: do not drop VMCB CPL to 0 if SS is not present
From: Roman Penyaev
Date: Wed May 31 2017 - 06:17:27 EST
On Tue, May 30, 2017 at 11:09 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
>
> On 30/05/2017 19:35, Roman Penyaev wrote:
>> On Tue, May 30, 2017 at 4:47 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>>>
>>>
>>> On 19/05/2017 18:14, Roman Penyaev wrote:
>>>> 2. A bit complicated, which makes sure the CPL field is preserved across
>>>> KVM_GET/SET_SREGS calls and makes svm_set_segment() and svm_get_segment()
>>>> functionality symmethric:
>>>
>>> I think I prefer this solution.
>>>
>>>> KVM SVM side:
>>>> -------------
>>>>
>>>> --- a/arch/x86/kvm/svm.c
>>>> +++ b/arch/x86/kvm/svm.c
>>>> @@ -1999,7 +1999,7 @@ static void svm_set_segment(struct kvm_vcpu *vcpu,
>>>> * would entail passing the CPL to userspace and back.
>>>> */
>>>> if (seg == VCPU_SREG_SS)
>>>> - svm->vmcb->save.cpl = (s->attrib >>
>>>> SVM_SELECTOR_DPL_SHIFT) & 3;
>>>> + svm->vmcb->save.cpl = (var->dpl & 3);
>>>>
>>>> mark_dirty(svm->vmcb, VMCB_SEG);
>>>> }
>>>
>>> I wonder why svm_set_segment is setting s->attrib = 0 at all. The
>>> manual only mentions checking P=0. What about something like:
>>>
>>> s->base = var->base;
>>> s->limit = var->limit;
>>> s->selector = var->selector;
>>> s->attrib = (var->type & SVM_SELECTOR_TYPE_MASK);
>>> s->attrib |= (var->s & 1) << SVM_SELECTOR_S_SHIFT;
>>> s->attrib |= (var->dpl & 3) << SVM_SELECTOR_DPL_SHIFT;
>>> s->attrib |= (var->present && !var->unusable) << SVM_SELECTOR_P_SHIFT;
>>> s->attrib |= (var->avl & 1) << SVM_SELECTOR_AVL_SHIFT;
>>> s->attrib |= (var->l & 1) << SVM_SELECTOR_L_SHIFT;
>>> s->attrib |= (var->db & 1) << SVM_SELECTOR_DB_SHIFT;
>>> s->attrib |= (var->g & 1) << SVM_SELECTOR_G_SHIFT;
>>
>> Do we care about compatibility issues? I mean can any old qemu send
>> us "garbage" in other members of 'var' structure if 'var->unused' == 1 ?
>
> That shouldn't matter, the processor shouldn't use them if P=0.
Could you please point me where did you find that? E.g. what I see in
AMD manual 24593âRev. 3.28âMarch 2017, section "Segment State in the VMCB",
top of the page 453:
NOTE: For the Stack Segment attributes, P is observed in legacy and
compatibility mode. In 64-bit mode, P is ignored because all
stack segments are treated as present.
So I am confused.
>> Oh, it seems we require one more field in 'struct kvm_segment' for CPL.
>
> Why? The point is exactly to use SS's var->dpl.
Yes, yes, let's use dpl as it is used now on svm_get_segment().
>
>>>> QEMU side:
>>>> ----------
>>>>
>>>> --- a/target/i386/kvm.c
>>>> +++ b/target/i386/kvm.c
>>>> @@ -1979,6 +1979,8 @@ static int kvm_get_sregs(X86CPU *cpu)
>>>> get_seg(&env->segs[R_FS], &sregs.fs);
>>>> get_seg(&env->segs[R_GS], &sregs.gs);
>>>> get_seg(&env->segs[R_SS], &sregs.ss);
>>>> + if (sregs.ss.unusable)
>>>> + env->segs[R_SS].flags |= sregs.ss.dpl << DESC_DPL_SHIFT;
>>>>
>>>> get_seg(&env->tr, &sregs.tr);
>>>> get_seg(&env->ldt, &sregs.ldt);
>>>
>>> I think what QEMU should do is, in get_seg
>>>
>>> if (rhs->unusable) {
>>> lhs->flags &= ~DESC_P_MASK;
>>>
>>> This would preserve the SS.DPL field. This should still work fine with
>>> QEMU commit 4cae9c9 (the loading side would set lhs->unusable).
>>
>> Indeed, it will preserve the *old* SS.DPL field, but will not take the *new*
>> one from kvm side. And what if extend get_seg() with additional 'segtype'
>> argument:
>
> Or:
>
> lhs->flags = (rhs->type << DESC_TYPE_SHIFT) |
> (rhs->present * DESC_P_MASK) |
> (rhs->dpl << DESC_DPL_SHIFT) |
> (rhs->db << DESC_B_SHIFT) |
> (rhs->s * DESC_S_MASK) |
> (rhs->l << DESC_L_SHIFT) |
> (rhs->g * DESC_G_MASK) |
> (rhs->avl * DESC_AVL_MASK);
> if (rhs->unusable) {
> lhs->flags = 0;
> }
>
> which could also be simply
>
> lhs->flags = ... |
> ((rhs->present && !rhs->unusable) * DESC_P_MASK) | ...;
>
> as in the KVM code.
True. Fully symmetric. So something like that:
Kernel:
-------
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d09bc3e7882c..ecb76d9bf0cb 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1466,6 +1466,7 @@ static void svm_get_segment(struct kvm_vcpu *vcpu,
*/
if (var->unusable)
var->db = 0;
+ /* This is symmetric with svm_set_segment() */
var->dpl = to_svm(vcpu)->vmcb->save.cpl;
break;
}
@@ -1610,18 +1611,14 @@ static void svm_set_segment(struct kvm_vcpu *vcpu,
s->base = var->base;
s->limit = var->limit;
s->selector = var->selector;
- if (var->unusable)
- s->attrib = 0;
- else {
- s->attrib = (var->type & SVM_SELECTOR_TYPE_MASK);
- s->attrib |= (var->s & 1) << SVM_SELECTOR_S_SHIFT;
- s->attrib |= (var->dpl & 3) << SVM_SELECTOR_DPL_SHIFT;
- s->attrib |= (var->present & 1) << SVM_SELECTOR_P_SHIFT;
- s->attrib |= (var->avl & 1) << SVM_SELECTOR_AVL_SHIFT;
- s->attrib |= (var->l & 1) << SVM_SELECTOR_L_SHIFT;
- s->attrib |= (var->db & 1) << SVM_SELECTOR_DB_SHIFT;
- s->attrib |= (var->g & 1) << SVM_SELECTOR_G_SHIFT;
- }
+ s->attrib = (var->type & SVM_SELECTOR_TYPE_MASK);
+ s->attrib |= (var->s & 1) << SVM_SELECTOR_S_SHIFT;
+ s->attrib |= (var->dpl & 3) << SVM_SELECTOR_DPL_SHIFT;
+ s->attrib |= ((var->present & 1) && !var->unusable) <<
SVM_SELECTOR_P_SHIFT;
+ s->attrib |= (var->avl & 1) << SVM_SELECTOR_AVL_SHIFT;
+ s->attrib |= (var->l & 1) << SVM_SELECTOR_L_SHIFT;
+ s->attrib |= (var->db & 1) << SVM_SELECTOR_DB_SHIFT;
+ s->attrib |= (var->g & 1) << SVM_SELECTOR_G_SHIFT;
/*
* This is always accurate, except if SYSRET returned to a segment
@@ -1630,7 +1627,8 @@ static void svm_set_segment(struct kvm_vcpu *vcpu,
* would entail passing the CPL to userspace and back.
*/
if (seg == VCPU_SREG_SS)
- svm->vmcb->save.cpl = (s->attrib >> SVM_SELECTOR_DPL_SHIFT) & 3;
+ /* This is symmetric with svm_get_segment() */
+ svm->vmcb->save.cpl = (var->dpl & 3);
mark_dirty(svm->vmcb, VMCB_SEG);
}
QEMU:
-----
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 011d4a55b136..faee904d9d59 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1300,18 +1300,14 @@ static void get_seg(SegmentCache *lhs, const
struct kvm_segment *rhs)
lhs->selector = rhs->selector;
lhs->base = rhs->base;
lhs->limit = rhs->limit;
- if (rhs->unusable) {
- lhs->flags = 0;
- } else {
- lhs->flags = (rhs->type << DESC_TYPE_SHIFT) |
- (rhs->present * DESC_P_MASK) |
- (rhs->dpl << DESC_DPL_SHIFT) |
- (rhs->db << DESC_B_SHIFT) |
- (rhs->s * DESC_S_MASK) |
- (rhs->l << DESC_L_SHIFT) |
- (rhs->g * DESC_G_MASK) |
- (rhs->avl * DESC_AVL_MASK);
- }
+ lhs->flags = (rhs->type << DESC_TYPE_SHIFT) |
+ ((rhs->present && !rhs->unusable) * DESC_P_MASK) |
+ (rhs->dpl << DESC_DPL_SHIFT) |
+ (rhs->db << DESC_B_SHIFT) |
+ (rhs->s * DESC_S_MASK) |
+ (rhs->l << DESC_L_SHIFT) |
+ (rhs->g * DESC_G_MASK) |
+ (rhs->avl * DESC_AVL_MASK);
}
--
Roman