Re: [PATCH 29/43] KVM: SVM: Tweak order of cr0/cr4/efer writes at RESET/INIT

From: Reiji Watanabe
Date: Sun May 23 2021 - 19:04:27 EST


> > AMD's APM Vol2 (Table 14-1 in Revision 3.37) says CR0 After INIT will be:
> >
> > CD and NW are unchanged
> > Bit 4 (reserved) = 1
> > All others = 0
> >
> > (CR0 will be 0x60000010 after RESET)
> >
> > So, it looks like the CR0 value that init_vmcb() sets could
> > differ from what the APM specifies for INIT.
> >
> > BTW, Intel's SDM (April 2021 version) says CR0 for Power up/Reset/INIT
> > will be 0x60000010 with the following note.
> > -------------------------------------------------
> > The CD and NW flags are unchanged,
> > bit 4 is set to 1, all other bits are cleared.
> > -------------------------------------------------
> > In the SDM, the note is attached as footnote '2' to all three
> > cases (Power up/Reset/INIT). I would guess that attaching it
> > to Power up/Reset is an error, though.
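
To make the difference between the two cases concrete, here is a minimal
sketch in C of the rule quoted above from Table 14-1. The macros follow the
usual X86_CR0_* naming but are defined locally, and cr0_after_reset() and
cr0_after_init() are hypothetical helpers for illustration only; they are
not taken from init_vmcb() or any other KVM code.

```c
#include <stdint.h>

/* CR0 bit positions (architectural); defined locally for this sketch. */
#define X86_CR0_ET (UINT32_C(1) << 4)   /* bit 4, reads as 1 */
#define X86_CR0_NW (UINT32_C(1) << 29)  /* Not Write-through */
#define X86_CR0_CD (UINT32_C(1) << 30)  /* Cache Disable */

/* Hypothetical helper: CR0 after RESET is a fixed value, 0x60000010. */
static uint32_t cr0_after_reset(void)
{
	return X86_CR0_CD | X86_CR0_NW | X86_CR0_ET;
}

/*
 * Hypothetical helper: CR0 after INIT per the rule quoted above.
 * CD and NW keep their previous values, bit 4 is set, all others are 0.
 */
static uint32_t cr0_after_init(uint32_t old_cr0)
{
	return (old_cr0 & (X86_CR0_CD | X86_CR0_NW)) | X86_CR0_ET;
}
```

With CD and NW both clear before INIT, cr0_after_init() yields 0x00000010
rather than the 0x60000010 RESET value, which is the discrepancy in question.
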
>
> Agreed. I'll double check that CD and NW are preserved by hardware on INIT,
> and will also ping Intel folks to fix the POWER-UP and RESET footnote.
>
> Hah! Reading through that section yet again, there's another SDM bug. It
> contradicts itself with respect to the TLBs after INIT.
>
> 9.1 INITIALIZATION OVERVIEW:
> The major difference is that during an INIT, the internal caches, MSRs,
> MTRRs, and x87 FPU state are left unchanged (although, the TLBs and BTB
> are invalidated as with a hardware reset)
>
> while Table 9-1 says:
>
> Register                     Power up     Reset        INIT
> Data and Code Cache, TLBs:   Invalid[6]   Invalid[6]   Unchanged
>
> I'm pretty sure that Intel CPUs are supposed to flush the TLB, i.e. Table 9-1 is
> wrong. Back in my Intel validation days, I remember being involved in a Core2
> bug that manifested as a triple fault after INIT due to global TLB entries not
> being flushed. Looks like that wasn't fixed:
>
> https://www.intel.com/content/dam/support/us/en/documents/processors/mobile/celeron/sb/320121.pdf
>
> AZ28. INIT Does Not Clear Global Entries in the TLB
> Problem: INIT may not flush a TLB entry when:
> • The processor is in protected mode with paging enabled and the page global enable
> flag is set (PGE bit of CR4 register)
> • G bit for the page table entry is set
> • TLB entry is present in TLB when INIT occurs
> Implication: Software may encounter unexpected page fault or incorrect address
> translation due to a TLB entry erroneously left in TLB after INIT.
>
> Workaround: Write to CR3, CR4 (setting bits PSE, PGE or PAE) or CR0 (setting
> bits PG or PE) registers before writing to memory early in BIOS
> code to clear all the global entries from TLB.
>
> Status: For the steppings affected, see the Summary Tables of Changes.
>
> AMD's APM also appears to contradict itself, though that depends on one's
> interpretation of "external initialization". Like the SDM, its table states that
> the TLBs are not flushed on INIT:
>
> Table 14-1. Initial Processor State
>
> Processor Resource           Value after RESET   Value after INIT
> Instruction and Data TLBs    Invalidated         Unchanged
>
> but a blurb later on says:
>
> 5.5.3 TLB Management
>
> Implicit Invalidations. The following operations cause the entire TLB to be
> invalidated, including global pages:
>
> • External initialization of the processor.

"Table 8-9. Simultaneous Interrupt Priorities" of AMD's APM has
the words "External Processor Initialization (INIT)", which makes
me think that "External initialization of the processor" in 5.5.3
TLB Management means INIT.


> All in all, that means KVM also has a bug in the form of a missing guest TLB
> flush on INIT, at least for VMX and probably for SVM. I'll add a patch to flush
> the guest TLBs on INIT irrespective of vendor. Even if AMD CPUs don't flush the
> TLB, I see no reason to bank on all guests being paranoid enough to flush the
> TLB immediately after INIT.

Yes, I agree that would be better.
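
As a rough illustration of why the flush on INIT has to cover global entries
too, here is a toy model in C. Every name in it (toy_vcpu, toy_flush_all(),
toy_vcpu_init_event() and so on) is hypothetical and exists only for this
sketch; none of it is KVM code, and the real patch may look quite different.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_TLB_ENTRIES 4

/* Toy TLB entry; 'global' models the G bit of the page table entry. */
struct toy_tlb_entry {
	bool valid;
	bool global;
};

/* Toy vCPU that only carries a tiny guest TLB. */
struct toy_vcpu {
	struct toy_tlb_entry tlb[TOY_TLB_ENTRIES];
};

/* Flush that skips global entries: they survive. */
static void toy_flush_non_global(struct toy_vcpu *vcpu)
{
	for (size_t i = 0; i < TOY_TLB_ENTRIES; i++)
		if (!vcpu->tlb[i].global)
			vcpu->tlb[i].valid = false;
}

/* Full flush, including global entries. */
static void toy_flush_all(struct toy_vcpu *vcpu)
{
	for (size_t i = 0; i < TOY_TLB_ENTRIES; i++)
		vcpu->tlb[i].valid = false;
}

/* INIT handling in this model: always do the full flush, for any vendor. */
static void toy_vcpu_init_event(struct toy_vcpu *vcpu)
{
	toy_flush_all(vcpu);
}

/* Count the entries that are still valid. */
static size_t toy_count_valid(const struct toy_vcpu *vcpu)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_TLB_ENTRIES; i++)
		if (vcpu->tlb[i].valid)
			n++;
	return n;
}
```

In this model a stale global entry is left behind by toy_flush_non_global()
but not by toy_vcpu_init_event(), so a guest that does not flush the TLB
itself right after INIT still starts with no stale translations.
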
Thank you so much for all the helpful information!

Regards,
Reiji