Re: [PATCH v2] x86, msr: Document AMD "tweak MSRs", use MSR_FnnH_NAME scheme for them
From: Borislav Petkov
Date: Tue Apr 25 2017 - 12:07:18 EST
On Tue, Apr 25, 2017 at 05:27:04PM +0200, Denys Vlasenko wrote:
> MSRs in 0xC001102x range (and a few close to this range)
> allow modifying some internal behavior of the pipeline.
>
> (There is one non-debug MSR in this range, introduced in Fam15h:
> MSR 0xC0011027 Address Mask For DR0 Breakpoints, aka DR0_ADDR_MASK).
>
> Sometimes these MSRs are used to work around errata.
>
> Let's have a comment about that.
>
> Let's use the following naming scheme for all of them: MSR_FnnH_REGNAME.
> This introduces some redundant names, but documents the CPU family where
> we are reasonably sure a particular register exists, and avoids the need
> to explain why the same register is either "Combined Unit Cfg"
> or "Bus Unit Cfg" - obviously, because the name depends on the CPU family.
>
> Renaming:
> MSR_AMD64_DC_CFG -> MSR_F10H_DC_CFG
> MSR_AMD64_BU_CFG2 -> MSR_F10H_BU_CFG2
> MSR_AMD64_LS_CFG -> MSR_F16H_LS_CFG
> MSR_AMD64_DE_CFG -> MSR_F12H_DE_CFG (and moving to msr-index.h)
>
> Here is a little compilation from about a dozen documents.
>
> C001_1000:
> 15h Errata 608 "P-state Limit Changes May Not Generate Interrupts"
> - worked around by setting bit 16.
> 15h Errata 671 "Debug Breakpoint on Misaligned Store May Cause System Hang"
> - worked around by setting bit 17 to 0.
> - AMD seems really reluctant to recommend this workaround; it must be painful
> 15h Errata 727 "Processor Core May Hang During CC6 Resume"
> - worked around by setting bit 15.
> - HW fixed in models 10h?
>
> C001_1020:
> K8 Errata 106
> "Potential Deadlock with Tightly Coupled Semaphores in an MP System"
> - worked around by setting bit 25.
> 10h,12h Errata 670
> "Segment Load May Cause System Hang or Fault After State Change"
> - worked around by setting bit 8.
> - this bit has something to do with handling of LOCK prefix.
> 14h Errata 530
> "Potential Violation of Read Ordering Rules Between Semaphore Operation
> and Subsequent Load Operations"
> - worked around by setting bit 36.
> 14h Errata 551
> "Processor May Not Forward Data From Store to a Page Crossing
> Read-Modify-Write Operation"
> - worked around by setting bit 25.
> 14h Errata 560
> "Processor May Incorrectly Forward Data with Non-cacheable Floating-Point
> 128-bit SSE Operation"
> - worked around by setting bit 18.
> 16h Errata 793
> "Specific Combination of Writes to Write Combined Memory Types and Locked
> Instructions May Cause Core Hang"
> - worked around by setting bit 15.
>
> C001_1021:
> K8 Errata 94
> "Sequential Prefetch Feature May Cause Incorrect Processor Operation"
> - worked around by setting bit 11.
> 14h Errata 688
> "Processor May Cause Unpredictable Program Behavior Under Highly Specific
> Branch Conditions"
> - worked around by setting bits 14 and 3.
> 16h Errata 776
> "Incorrect Processor Branch Prediction for Two Consecutive Linear Pages"
> - worked around by setting bit 26.
> - HW fixed in models 30h?
>
> C001_1022:
> K8 Errata 97 "128-Bit Streaming Stores May Cause Coherency Failure"
> - worked around by setting bit 3.
> K8 Errata 81
> "Cache Coherency Problem with Hardware Prefetching and Streaming Stores"
> - worked around by setting bit 10.
> 10h Errata 261
> "Processor May Stall Entering Stop-Grant Due to Pending Data Cache Scrub"
> - worked around by setting bit 24.
> 10h Errata 326 "Misaligned Load Operation May Cause Processor Core Hang"
> - worked around by setting bits 43:42 to 00.
> 10h Errata 383
> "CPU Core May Machine Check When System Software Changes Page Tables
> Dynamically"
> - worked around by setting bit 47.
> 15h Errata 674
> "Processor May Cache Prefetched Data from Remapped Memory Region"
> - worked around by setting bit 13.
>
> C001_1023:
> K8 Errata 69
> "Multiprocessor Coherency Problem with Hardware Prefetch Mechanism"
> - worked around by setting bit 45.
> K8 Errata 113 "Enhanced Write-Combining Feature Causes System Hang"
> - worked around by setting bit 48.
> 10h Errata 254 "Internal Resource Livelock Involving Cached TLB Reload"
> - worked around by setting bit 21.
> 10h Errata 298
> "L2 Eviction May Occur During Processor Operation To Set Accessed or Dirty Bit"
> - worked around by setting bit 1.
> 10h Errata 309
> "Processor Core May Execute Incorrect Instructions on Concurrent L2 and
> Northbridge Response"
> - worked around by setting bit 23.
>
> C001_1029:
> 10h,12h Errata 721 "Processor May Incorrectly Update Stack Pointer"
> - worked around by setting bit 0.
> 12h Errata 665 "Integer Divide Instruction May Cause Unpredictable Behavior"
> - worked around by setting bit 31.
> Bit 23 serializes CLFLUSH instruction.
>
> C001_102A:
> 15h Errata 503 "APIC Task-Priority Register May Be Incorrect"
> - worked around by setting bit 11.
>
> K8_BKDG documents none of these registers, but Revision Guide mentions
> them a lot.
>
> 10h_BKDG documents them as:
> MSRC001_1021 Instruction Cache Configuration Register (IC_CFG)
> MSRC001_1022 Data Cache Configuration (DC_CFG)
> MSRC001_1023 Bus Unit Configuration Register (BU_CFG)
> MSRC001_102A Bus Unit Configuration 2 (BU_CFG2)
>
> 11h_BKDG documents:
> MSRC001_1022 Data Cache Configuration (DC_CFG)
> MSRC001_1023 Bus Unit Configuration Register (BU_CFG)
>
> 12h_BKDG documents:
> MSRC001_1020 Load-Store Configuration (LS_CFG)
> MSRC001_1021 Instruction Cache Configuration (IC_CFG)
> MSRC001_1022 Data Cache Configuration (DC_CFG)
> MSRC001_1029 Decode Configuration (DE_CFG)
> MSRC001_102A Combined Unit Configuration 2 (CU_CFG2) - name change since 10h
>
> 14h_Mod_00h-0Fh_BKDG documents only:
> MSRC001_1020 Load-Store Configuration (LS_CFG)
> MSRC001_1021 Instruction Cache Configuration (IC_CFG)
> MSRC001_1022 Data Cache Configuration (DC_CFG)
>
> 15h_Mod_00h-0Fh_BKDG documents more:
> MSRC001_1020 Load-Store Configuration (LS_CFG)
> MSRC001_1021 Instruction Cache Configuration (IC_CFG)
> MSRC001_1022 Data Cache Configuration (DC_CFG)
> MSRC001_1023 Combined Unit Configuration (CU_CFG) - name change since 11h
> MSRC001_1028 Floating Point Configuration (FP_CFG)
> MSRC001_1029 Decode Configuration (DE_CFG)
> MSRC001_102A Combined Unit Configuration 2 (CU_CFG2)
> MSRC001_102B Combined Unit Configuration 3 (CU_CFG3)
> MSRC001_102C Execution Unit Configuration (EX_CFG)
> MSRC001_102D Load-Store Configuration 2 (LS_CFG2)
>
> 15h_Mod_10h-1Fh_BKDG: does not mention MSRC001_1029 and MSRC001_102C.
>
> 15h_Mod_30h-3Fh_BKDG: does not mention MSRC001_1029, MSRC001_102C
> and MSRC001_102D, adds new one:
> MSRC001_102F Prefetch Throttling Configuration (CU_PFTCFG)
>
> 15h_Mod_60h-6Fh_BKDG: also fails to mention MSRC001_1029, MSRC001_102C
> and MSRC001_102D, but has new ones:
> MSRC001_101C Load-Store Configuration 3 (LS_CFG3)
> MSRC001_1090 Processor Feedback Constants 0
> MSRC001_10A1 Contention Blocking Buffer Control (CU_CBBCFG)
>
> MSRC001_1000 is only mentioned in 15h errata, name unknown.
>
> 16h_Mod_00h-0Fh_BKDG: stuff disappeared or got renamed
> (1023 and 102A are "Bus Unit Configuration" again):
> MSRC001_1020 Load-Store Configuration (LS_CFG)
> MSRC001_1021 Instruction Cache Configuration (IC_CFG)
> MSRC001_1022 Data Cache Configuration (DC_CFG)
> MSRC001_1023 Bus Unit Configuration (BU_CFG)
> MSRC001_1028 Floating Point Configuration (FP_CFG) - all bits are "reserved"
> MSRC001_102A Bus Unit Configuration 2 (BU_CFG2)
>
> 16h_Mod_30h-3Fh_BKDG: FP_CFG now has one documented field
This "little compilation" of counting all those MSRs is completely
useless and unneeded.
> CC: Ingo Molnar <mingo@xxxxxxxxxx>
> CC: Andy Lutomirski <luto@xxxxxxxxxx>
> CC: Borislav Petkov <bp@xxxxxxxxx>
> CC: Brian Gerst <brgerst@xxxxxxxxx>
> CC: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> CC: "H. Peter Anvin" <hpa@xxxxxxxxxxxxxxx>
> CC: x86@xxxxxxxxxx
> CC: linux-kernel@xxxxxxxxxxxxxxx
> Signed-off-by: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
> ---
> arch/x86/include/asm/msr-index.h | 63 +++++++++++++++++++++++++++++++++++++---
> arch/x86/kernel/cpu/amd.c | 10 +++----
> arch/x86/kvm/svm.c | 4 +--
> arch/x86/kvm/x86.c | 4 +--
> 4 files changed, 67 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index d8b5f8a..8f89dd3 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -293,9 +293,6 @@
> #define MSR_AMD64_PATCH_LOADER 0xc0010020
> #define MSR_AMD64_OSVW_ID_LENGTH 0xc0010140
> #define MSR_AMD64_OSVW_STATUS 0xc0010141
> -#define MSR_AMD64_LS_CFG 0xc0011020
> -#define MSR_AMD64_DC_CFG 0xc0011022
> -#define MSR_AMD64_BU_CFG2 0xc001102a
> #define MSR_AMD64_IBSFETCHCTL 0xc0011030
> #define MSR_AMD64_IBSFETCHLINAD 0xc0011031
> #define MSR_AMD64_IBSFETCHPHYSAD 0xc0011032
> @@ -315,6 +312,65 @@
> #define MSR_AMD64_IBSOPDATA4 0xc001103d
> #define MSR_AMD64_IBS_REG_COUNT_MAX 8 /* includes MSR_AMD64_IBSBRTARGET */
>
> +/*
> + * MSRs in 0xc001101c-0xc001102f range are sparsely documented in BKDGs,
> + * but sometimes they can be found in errata documents.
> + * Registers 1020-1023 exist since K8 (mentioned in errata docs).
> + * Fam10h also has registers 1029, 102a (maybe more, not in docs).
> + * Fam15h BKDGs document registers 1028, 102b-102f, 101c, 1090, 10a1.
> + * Registers 1023 and 102a are called "Combined Unit Cfg" or "Bus Unit Cfg",
> + * depending on the CPU family.
> + */
> +#define MSR_K8_LS_CFG 0xc0011020
> +#define MSR_K8_IC_CFG 0xc0011021
> +#define MSR_K8_DC_CFG 0xc0011022
> +#define MSR_K8_BU_CFG 0xc0011023
> +
> +#define MSR_F10H_LS_CFG 0xc0011020
> +#define MSR_F10H_IC_CFG 0xc0011021
> +#define MSR_F10H_DC_CFG 0xc0011022
> +#define MSR_F10H_BU_CFG 0xc0011023
> +#define MSR_F10H_DE_CFG 0xc0011029
> +#define MSR_F10H_BU_CFG2 0xc001102a
> +
> +#define MSR_F12H_LS_CFG 0xc0011020
> +#define MSR_F12H_IC_CFG 0xc0011021
> +#define MSR_F12H_DC_CFG 0xc0011022
> +#define MSR_F12H_CU_CFG 0xc0011023
> +#define MSR_F12H_DE_CFG 0xc0011029
> +#define MSR_F12H_CU_CFG2 0xc001102a
> +
> +#define MSR_F14H_LS_CFG 0xc0011020
> +#define MSR_F14H_IC_CFG 0xc0011021
> +#define MSR_F14H_DC_CFG 0xc0011022
> +#define MSR_F14H_CU_CFG 0xc0011023
> +#define MSR_F14H_FP_CFG 0xc0011028
> +#define MSR_F14H_DE_CFG 0xc0011029
> +#define MSR_F14H_CU_CFG2 0xc001102a
> +
> +#define MSR_F16H_LS_CFG 0xc0011020
> +#define MSR_F16H_IC_CFG 0xc0011021
> +#define MSR_F16H_DC_CFG 0xc0011022
> +#define MSR_F16H_BU_CFG 0xc0011023
> +#define MSR_F16H_FP_CFG 0xc0011028
> +#define MSR_F16H_DE_CFG 0xc0011029
> +#define MSR_F16H_BU_CFG2 0xc001102a
> +
> +#define MSR_F15H_LS_CFG3 0xc001101c
> +#define MSR_F15H_LS_CFG 0xc0011020
> +#define MSR_F15H_IC_CFG 0xc0011021
> +#define MSR_F15H_DC_CFG 0xc0011022
> +#define MSR_F15H_CU_CFG 0xc0011023
> +#define MSR_F15H_FP_CFG 0xc0011028
> +#define MSR_F15H_DE_CFG 0xc0011029
> +#define MSR_F15H_CU_CFG2 0xc001102a
> +#define MSR_F15H_CU_CFG3 0xc001102b
> +#define MSR_F15H_EX_CFG 0xc001102c
> +#define MSR_F15H_LS_CFG2 0xc001102d
> +#define MSR_F15H_CU_PFTCFG 0xc001102f
> +#define MSR_F15H_CU_PROCFB_SCALE_0 0xc0011090
> +#define MSR_F15H_CU_CBBCFG 0xc00110a1
Please no. Not every MSR for every family, only the 4 which are actually
being used. We can't keep the full 32-bit MSR space in here.
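To illustrate, a minimal sketch of what a trimmed-down fragment might
look like, assuming only the four renames listed in the patch
description are kept. The helper msr_is_used_tweak_msr() is hypothetical
and exists purely for illustration; it is not part of the kernel or of
this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: only the four MSRs renamed by the patch are defined,
 * each under the family name chosen in the patch description.
 */
#define MSR_F16H_LS_CFG		0xc0011020
#define MSR_F10H_DC_CFG		0xc0011022
#define MSR_F12H_DE_CFG		0xc0011029
#define MSR_F10H_BU_CFG2	0xc001102a

/* All four sit inside the 0xc001101c-0xc001102f range named in the patch comment. */
static_assert(MSR_F16H_LS_CFG >= 0xc001101c, "LS_CFG below documented range");
static_assert(MSR_F10H_BU_CFG2 <= 0xc001102f, "BU_CFG2 above documented range");

/* Hypothetical helper: true if msr is one of the four defined above. */
static inline bool msr_is_used_tweak_msr(uint32_t msr)
{
	switch (msr) {
	case MSR_F16H_LS_CFG:
	case MSR_F10H_DC_CFG:
	case MSR_F12H_DE_CFG:
	case MSR_F10H_BU_CFG2:
		return true;
	default:
		return false;
	}
}
```

With this, 0xc0011021 (IC_CFG in the BKDGs) has no define at all until
some code actually needs it.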
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.