RE: [PATCH v2 2/3] KVM:SVM: Add extended intercept support

From: Babu Moger
Date: Wed Jun 17 2020 - 14:12:03 EST


Jim,

> -----Original Message-----
> From: kvm-owner@xxxxxxxxxxxxxxx <kvm-owner@xxxxxxxxxxxxxxx> On Behalf
> Of Babu Moger
> Sent: Wednesday, June 17, 2020 9:31 AM
> To: Jim Mattson <jmattson@xxxxxxxxxx>
> Cc: Wanpeng Li <wanpengli@xxxxxxxxxxx>; Joerg Roedel <joro@xxxxxxxxxx>;
> the arch/x86 maintainers <x86@xxxxxxxxxx>; Sean Christopherson
> <sean.j.christopherson@xxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>;
> Borislav Petkov <bp@xxxxxxxxx>; H . Peter Anvin <hpa@xxxxxxxxx>; Paolo
> Bonzini <pbonzini@xxxxxxxxxx>; Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>;
> Thomas Gleixner <tglx@xxxxxxxxxxxxx>; LKML <linux-kernel@xxxxxxxxxxxxxxx>;
> kvm list <kvm@xxxxxxxxxxxxxxx>
> Subject: RE: [PATCH v2 2/3] KVM:SVM: Add extended intercept support
>
>
>
> > -----Original Message-----
> > From: Jim Mattson <jmattson@xxxxxxxxxx>
> > Sent: Tuesday, June 16, 2020 6:17 PM
> > To: Moger, Babu <Babu.Moger@xxxxxxx>
> > Cc: Wanpeng Li <wanpengli@xxxxxxxxxxx>; Joerg Roedel <joro@xxxxxxxxxx>;
> > the arch/x86 maintainers <x86@xxxxxxxxxx>; Sean Christopherson
> > <sean.j.christopherson@xxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>;
> > Borislav Petkov <bp@xxxxxxxxx>; H . Peter Anvin <hpa@xxxxxxxxx>; Paolo
> > Bonzini <pbonzini@xxxxxxxxxx>; Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>;
> > Thomas Gleixner <tglx@xxxxxxxxxxxxx>; LKML <linux-kernel@xxxxxxxxxxxxxxx>;
> > kvm list <kvm@xxxxxxxxxxxxxxx>
> > Subject: Re: [PATCH v2 2/3] KVM:SVM: Add extended intercept support
> >
> > On Tue, Jun 16, 2020 at 3:03 PM Babu Moger <babu.moger@xxxxxxx>
> wrote:
> > >
> > > The new intercept bits have been added in vmcb control
> > > area to support the interception of INVPCID instruction.
> > >
> > > The following bit is added to the VMCB layout control area
> > > to control intercept of INVPCID:
> > >
> > > Byte Offset Bit(s) Function
> > > 14h 2 intercept INVPCID
> > >
> > > Add the interfaces to support these extended interception.
> > > Also update the tracing for extended intercepts.
> > >
> > > AMD documentation for INVPCID feature is available at "AMD64
> > > Architecture Programmerâs Manual Volume 2: System Programming,
> > > Pub. 24593 Rev. 3.34(or later)"
> > >
> > > The documentation can be obtained at the links below:
> > > Link:
> >
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.a
> >
> md.com%2Fsystem%2Ffiles%2FTechDocs%2F24593.pdf&amp;data=02%7C01%7
> >
> Cbabu.moger%40amd.com%7C4cedcb3567194883601e08d8124b6be7%7C3dd8
> >
> 961fe4884e608e11a82d994e183d%7C0%7C0%7C637279463210520563&amp;s
> >
> data=oRQq0hj0O43A4lnl8JEb%2BHt8oCFHWxcqvLaA1%2BacTJc%3D&amp;reser
> > ved=0
> > > Link:
> >
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.
> >
> kernel.org%2Fshow_bug.cgi%3Fid%3D206537&amp;data=02%7C01%7Cbabu.m
> >
> oger%40amd.com%7C4cedcb3567194883601e08d8124b6be7%7C3dd8961fe48
> >
> 84e608e11a82d994e183d%7C0%7C0%7C637279463210520563&amp;sdata=EtA
> > rCUBB8etloN%2B%2Blx42RZqai12QFvtJefnxBn1ryMQ%3D&amp;reserved=0
> >
> > Not your change, but this documentation is terrible. There is no
> > INVLPCID instruction, nor is there a PCID instruction.
>
> Sorry about that. I will bring this to their notice.
>
> >
> > > Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
> > > ---
> > > arch/x86/include/asm/svm.h | 3 ++-
> > > arch/x86/kvm/svm/nested.c | 6 +++++-
> > > arch/x86/kvm/svm/svm.c | 1 +
> > > arch/x86/kvm/svm/svm.h | 18 ++++++++++++++++++
> > > arch/x86/kvm/trace.h | 12 ++++++++----
> > > 5 files changed, 34 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> > > index 8a1f5382a4ea..62649fba8908 100644
> > > --- a/arch/x86/include/asm/svm.h
> > > +++ b/arch/x86/include/asm/svm.h
> > > @@ -61,7 +61,8 @@ struct __attribute__ ((__packed__))
> vmcb_control_area {
> > > u32 intercept_dr;
> > > u32 intercept_exceptions;
> > > u64 intercept;
> > > - u8 reserved_1[40];
> > > + u32 intercept_extended;
> > > + u8 reserved_1[36];
> >
> > It seems like a more straightforward implementation would simply
> > change 'u64 intercept' to 'u32 intercept[3].'
>
> Sure. Will change it.

This involves much more changes than I originally thought. All these
following code needs to be modified. Here is my cscope output for the C
symbol intercept.

0 nested.c recalc_intercepts 123 c->intercept = h->intercept;
1 nested.c recalc_intercepts 135 c->intercept &= ~(1ULL <<
INTERCEPT_VINTR);
2 nested.c recalc_intercepts 139 c->intercept &= ~(1ULL <<
INTERCEPT_VMMCALL);
3 nested.c recalc_intercepts 144 c->intercept |= g->intercept;
4 nested.c copy_vmcb_control_area 153 dst->intercept =
from->intercept;
5 nested.c nested_svm_vmrun_msrpm 186 if
(!(svm->nested.ctl.intercept & (1ULL << INTERCEPT_MSR_PROT)))
6 nested.c nested_vmcb_check_controls 212 if ((control->intercept & (1ULL
<< INTERCEPT_VMRUN)) == 0)NIT));
7 nested.c nested_svm_vmrun 436
nested_vmcb->control.intercept);
8 nested.c nested_svm_exit_handled_msr 648 if
(!(svm->nested.ctl.intercept & (1ULL << INTERCEPT_MSR_PROT)))
9 nested.c nested_svm_intercept_ioio 675 if
(!(svm->nested.ctl.intercept & (1ULL << INTERCEPT_IOIO_PROT)))
a nested.c nested_svm_intercept 732 if
(svm->nested.ctl.intercept & exit_bits)
b nested.c nested_exit_on_init 840 return
(svm->nested.ctl.intercept & (1ULL << INTERCEPT_INIT));
c svm.c check_selective_cr0_intercepted 2205 u64 intercept;
d svm.c check_selective_cr0_intercepted 2207 intercept =
svm->nested.ctl.intercept;
e svm.c check_selective_cr0_intercepted 2210 (!(intercept & (1ULL <<
INTERCEPT_SELECTIVE_CR0))))
f svm.c dump_vmcb 2803 pr_err("%-20s%016llx\n",
"intercepts:", control->intercept);
m svm.c svm_check_intercept 3687 intercept =
svm->nested.ctl.intercept;
n svm.c svm_check_intercept 3689 if (!(intercept & (1ULL <<
INTERCEPT_SELECTIVE_CR0)))
6 svm.c svm_apic_init_signal_blocked 3948
(svm->vmcb->control.intercept & (1ULL << INTERCEPT_INIT));
7 svm.h set_intercept 300 vmcb->control.intercept |=
(1ULL << bit);
8 svm.h clr_intercept 309 vmcb->control.intercept &=
~(1ULL << bit);
9 svm.h is_intercept tercept_ioio 316 return
(svm->vmcb->control.intercept & (1ULL << bit)) != 0;
a svm.h nested_exit_on_smi 377 return
(svm->nested.ctl.intercept & (1ULL << INTERCEPT_SMI));
b svm.h nested_exit_on_intr 382 return
(svm->nested.ctl.intercept & (1ULL << INTERCEPT_INTR));
c svm.h nested_exit_on_nmi 387 return
(svm->nested.ctl.intercept & (1ULL << INTERCEPT_NMI));

I will have to test extensively if I go ahead with these changes. What do
you think?

>
> >
> > > u16 pause_filter_thresh;
> > > u16 pause_filter_count;
> > > u64 iopm_base_pa;
> > > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > > index 8a6db11dcb43..7f6d0f2533e2 100644
> > > --- a/arch/x86/kvm/svm/nested.c
> > > +++ b/arch/x86/kvm/svm/nested.c
> > > @@ -121,6 +121,7 @@ void recalc_intercepts(struct vcpu_svm *svm)
> > > c->intercept_dr = h->intercept_dr;
> > > c->intercept_exceptions = h->intercept_exceptions;
> > > c->intercept = h->intercept;
> > > + c->intercept_extended = h->intercept_extended;
> > >
> > > if (g->int_ctl & V_INTR_MASKING_MASK) {
> > > /* We only want the cr8 intercept bits of L1 */
> > > @@ -142,6 +143,7 @@ void recalc_intercepts(struct vcpu_svm *svm)
> > > c->intercept_dr |= g->intercept_dr;
> > > c->intercept_exceptions |= g->intercept_exceptions;
> > > c->intercept |= g->intercept;
> > > + c->intercept_extended |= g->intercept_extended;
> > > }
> > >
> > > static void copy_vmcb_control_area(struct vmcb_control_area *dst,
> > > @@ -151,6 +153,7 @@ static void copy_vmcb_control_area(struct
> > vmcb_control_area *dst,
> > > dst->intercept_dr = from->intercept_dr;
> > > dst->intercept_exceptions = from->intercept_exceptions;
> > > dst->intercept = from->intercept;
> > > + dst->intercept_extended = from->intercept_extended;
> > > dst->iopm_base_pa = from->iopm_base_pa;
> > > dst->msrpm_base_pa = from->msrpm_base_pa;
> > > dst->tsc_offset = from->tsc_offset;
> > > @@ -433,7 +436,8 @@ int nested_svm_vmrun(struct vcpu_svm *svm)
> > > trace_kvm_nested_intercepts(nested_vmcb->control.intercept_cr &
> > 0xffff,
> > > nested_vmcb->control.intercept_cr >> 16,
> > > nested_vmcb->control.intercept_exceptions,
> > > - nested_vmcb->control.intercept);
> > > + nested_vmcb->control.intercept,
> > > + nested_vmcb->control.intercept_extended);
> > >
> > > /* Clear internal status */
> > > kvm_clear_exception_queue(&svm->vcpu);
> > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > > index 9e333b91ff78..285e5e1ff518 100644
> > > --- a/arch/x86/kvm/svm/svm.c
> > > +++ b/arch/x86/kvm/svm/svm.c
> > > @@ -2801,6 +2801,7 @@ static void dump_vmcb(struct kvm_vcpu *vcpu)
> > > pr_err("%-20s%04x\n", "dr_write:", control->intercept_dr >> 16);
> > > pr_err("%-20s%08x\n", "exceptions:", control->intercept_exceptions);
> > > pr_err("%-20s%016llx\n", "intercepts:", control->intercept);
> > > + pr_err("%-20s%08x\n", "intercepts (extended):", control-
> > >intercept_extended);
> > > pr_err("%-20s%d\n", "pause filter count:", control->pause_filter_count);
> > > pr_err("%-20s%d\n", "pause filter threshold:",
> > > control->pause_filter_thresh);
> > > diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> > > index 6ac4c00a5d82..935d08fac03d 100644
> > > --- a/arch/x86/kvm/svm/svm.h
> > > +++ b/arch/x86/kvm/svm/svm.h
> > > @@ -311,6 +311,24 @@ static inline void clr_intercept(struct vcpu_svm
> *svm,
> > int bit)
> > > recalc_intercepts(svm);
> > > }
> > >
> > > +static inline void set_extended_intercept(struct vcpu_svm *svm, int bit)
> > > +{
> > > + struct vmcb *vmcb = get_host_vmcb(svm);
> > > +
> > > + vmcb->control.intercept_extended |= (1U << bit);
> > > +
> > > + recalc_intercepts(svm);
> > > +}
> > > +
> > > +static inline void clr_extended_intercept(struct vcpu_svm *svm, int bit)
> > > +{
> > > + struct vmcb *vmcb = get_host_vmcb(svm);
> > > +
> > > + vmcb->control.intercept_extended &= ~(1U << bit);
> > > +
> > > + recalc_intercepts(svm);
> > > +}
> >
> > You wouldn't need these new functions if you defined 'u32
> > intercept[3],' as I suggested above. Just change set_intercept and
> > clr_intercept to use __set_bit and __clear_bit.
>
> Yes. Will change it. Thanks
>
> >
> > > static inline bool is_intercept(struct vcpu_svm *svm, int bit)
> > > {
> > > return (svm->vmcb->control.intercept & (1ULL << bit)) != 0;
> > > diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> > > index b66432b015d2..5c841c42b33d 100644
> > > --- a/arch/x86/kvm/trace.h
> > > +++ b/arch/x86/kvm/trace.h
> > > @@ -544,14 +544,16 @@ TRACE_EVENT(kvm_nested_vmrun,
> > > );
> > >
> > > TRACE_EVENT(kvm_nested_intercepts,
> > > - TP_PROTO(__u16 cr_read, __u16 cr_write, __u32 exceptions, __u64
> > intercept),
> > > - TP_ARGS(cr_read, cr_write, exceptions, intercept),
> > > + TP_PROTO(__u16 cr_read, __u16 cr_write, __u32 exceptions, __u64
> > intercept,
> > > + __u32 extended),
> > > + TP_ARGS(cr_read, cr_write, exceptions, intercept, extended),
> > >
> > > TP_STRUCT__entry(
> > > __field( __u16, cr_read )
> > > __field( __u16, cr_write )
> > > __field( __u32, exceptions )
> > > __field( __u64, intercept )
> > > + __field( __u32, extended )
> > > ),
> > >
> > > TP_fast_assign(
> > > @@ -559,11 +561,13 @@ TRACE_EVENT(kvm_nested_intercepts,
> > > __entry->cr_write = cr_write;
> > > __entry->exceptions = exceptions;
> > > __entry->intercept = intercept;
> > > + __entry->extended = extended;
> > > ),
> > >
> > > - TP_printk("cr_read: %04x cr_write: %04x excp: %08x intercept:
> %016llx",
> > > + TP_printk("cr_read: %04x cr_write: %04x excp: %08x intercept:
> %016llx"
> > > + "intercept (extended): %08x",
> > > __entry->cr_read, __entry->cr_write, __entry->exceptions,
> > > - __entry->intercept)
> > > + __entry->intercept, __entry->extended)
> > > );
> > > /*
> > > * Tracepoint for #VMEXIT while nested
> > >