Re: [RFC PATCH] x86/fpu/xstate: Add more diagnostic information on inconsistent xstate sizes
From: Fenghua Yu
Date: Tue Apr 11 2023 - 21:21:15 EST
Hi, Chang,
On 4/11/23 09:29, Chang S. Bae wrote:
On 4/10/2023 1:43 PM, Fenghua Yu wrote:
On 4/7/23 11:22, Chang S. Bae wrote:
On 4/5/2023 11:39 AM, Fenghua Yu wrote:
diff --git a/arch/x86/kernel/fpu/xstate.c
b/arch/x86/kernel/fpu/xstate.c
index 0bab497c9436..5f27fcdc6c90 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -602,8 +602,37 @@ static bool __init
paranoid_xstate_size_valid(unsigned int kernel_size)
}
}
size = xstate_calculate_size(fpu_kernel_cfg.max_features,
compacted);
- XSTATE_WARN_ON(size != kernel_size,
- "size %u != kernel_size %u\n", size, kernel_size);
+ if (size != kernel_size) {
+ u64 xcr0, ia32_xss;
+
+ XSTATE_WARN_ON(1, "size %u != kernel_size %u\n",
+ size, kernel_size);
+
+ /* Show more information to help diagnose the size issue. */
+ pr_info("x86/fpu: max_features=0x%llx\n",
+ fpu_kernel_cfg.max_features);
+ print_xstate_offset_size();
+ pr_info("x86/fpu: total size: %u bytes\n", size);
+ xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
+ if (compacted) {
+ rdmsrl(MSR_IA32_XSS, ia32_xss);
This shouldn't be directly read here because of the LBR state component.
See the function comment:
* Independent XSAVE features allocate their own buffers and are not
* covered by these checks. Only the size of the buffer for task->fpu
* is checked here.
But, isn't that max_features bitmask pretty much about it?
How about getting IA32_XSS from xfeatures_mask_supervisor()? That's
how to get kernel_size by setting IA32_XSS without independent
features in get_xsave_compacted_size()
I think what it tests here is comparing the sizes between the kernel
code and microcode calculations on the same input, which is the
max_features bitmask.
We know that the kernel code calculates the size based on it and also
takes it to write down there -- XCR0 and IA32_XSS. Then, showing that
bitmask looks to be enough I thought, no?
First of all, max_features is already shown.
kernel_size comes from CPUID.0xd.0x1:EBX, which takes XCR0 | IA32_XSS as
input. If the platform has a wrong XCR0 or IA32_XSS, CPUID reports a wrong
kernel_size. The purpose of this patch is to provide more debug info to
help diagnose platform/kernel issues. So instead of showing only the
combined max_features, using xgetbv() to get XCR0 and
xfeatures_mask_supervisor() to get IA32_XSS provides more debug info in
case the platform has an issue in XCR0 or IA32_XSS.
In other words, splitting max_features into XCR0 and IA32_XSS and
showing them individually provides more useful debug info than a single
max_features value.
Does it make sense?
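To illustrate the split described above, here is a minimal userspace
sketch. It is not the kernel code: DEMO_SUPERVISOR_MASK,
struct demo_split and demo_split_features() are hypothetical names, and
the supervisor bit positions are only an assumption for the example. In
the kernel the supervisor part would come from
xfeatures_mask_supervisor() and XCR0 from xgetbv().

```c
#include <stdint.h>

/*
 * Assumption for this sketch only: these bits stand for the supervisor
 * components that are enabled through IA32_XSS rather than XCR0.
 */
#define DEMO_SUPERVISOR_MASK \
	((1ULL << 8) | (1ULL << 10) | (1ULL << 11) | (1ULL << 12))

/* Hypothetical container for the two halves of a feature bitmask. */
struct demo_split {
	uint64_t xcr0_part;	/* user components, expected in XCR0 */
	uint64_t xss_part;	/* supervisor components, expected in IA32_XSS */
};

/* Split one combined feature mask into its XCR0 and IA32_XSS parts. */
static struct demo_split demo_split_features(uint64_t max_features)
{
	struct demo_split s;

	s.xss_part = max_features & DEMO_SUPERVISOR_MASK;
	s.xcr0_part = max_features & ~DEMO_SUPERVISOR_MASK;
	return s;
}
```

Printing the two parts next to the values the hardware actually holds
would show which register disagrees, which a single combined mask cannot.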
I still expect some acknowledgment of what is coded here for the kernel
calculation details.
The kernel calculation is shown by
+ print_xstate_offset_size();
+ pr_info("x86/fpu: total size: %u bytes\n", size);
Isn't that detailed enough? It shows the offset and size of each xstate
component and the sum of the sizes.
After that,
+ pr_info("x86/fpu: kernel_size from CPUID.0xd.0x%x:EBX: %u bytes\n",
+ compacted ? 1 : 0, kernel_size);
shows the kernel_size reported by CPUID.
With the above debug info, a real platform CPUID issue shows up clearly.
What other details are needed?
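As a rough illustration of the comparison this debug output supports,
the sketch below sums per-component sizes for a compacted layout and
checks the result against a reported size. It is a userspace
approximation, not xstate_calculate_size(): demo_table holds made-up
sizes and alignment flags, and every demo_* name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_LEGACY_AREA	512u	/* legacy FXSAVE region */
#define DEMO_XSAVE_HDR		64u	/* XSAVE header */

/* Hypothetical per-component description; values below are made up. */
struct demo_component {
	unsigned int size;	/* bytes occupied by the component */
	bool align64;		/* start on a 64-byte boundary when compacted */
};

static const struct demo_component demo_table[] = {
	{ 0, false },	/* 0: lives in the legacy area */
	{ 0, false },	/* 1: lives in the legacy area */
	{ 256, false },
	{ 8, false },
	{ 64, true },
	{ 16, false },
};

#define DEMO_NR (sizeof(demo_table) / sizeof(demo_table[0]))

/* Sum the enabled components in index order, honoring alignment. */
static unsigned int demo_compacted_size(uint64_t features,
					const struct demo_component *table,
					unsigned int nr)
{
	unsigned int offset = DEMO_LEGACY_AREA + DEMO_XSAVE_HDR;

	for (unsigned int i = 2; i < nr; i++) {
		if (!(features & (1ULL << i)))
			continue;
		if (table[i].align64)
			offset = (offset + 63u) & ~63u;
		offset += table[i].size;
	}
	return offset;
}

/* Report both values when the computed and reported sizes differ. */
static bool demo_sizes_agree(unsigned int computed, unsigned int reported)
{
	if (computed == reported)
		return true;
	fprintf(stderr, "size %u != reported size %u\n", computed, reported);
	return false;
}
```

With the per-component breakdown and the reported total side by side, a
mismatch can be traced to either a wrong component entry or a wrong
input mask.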
Thanks.
-Fenghua