Re: [PATCH v6-fix 06/26] x86/fpu/xstate: Calculate and remember dynamic XSTATE buffer sizes
From: Dave Hansen
Date: Fri Jul 02 2021 - 13:46:44 EST
On 7/2/21 8:17 AM, Chang S. Bae wrote:
> The XSTATE per-task buffer is currently embedded into struct fpu with
> static size. To accommodate dynamic user XSTATEs, record the maximum and
> minimum buffer sizes.
>
> Rename the size calculation function. It calculates the maximum XSTATE size
> and sanity checks it with CPUID. It also calculates the static embedded
> buffer size by excluding the dynamic user states from the maximum size.
This is missing a few things for me. I'd probably start with this instead:
The CPUID instruction separately enumerates sizes and alignments
of individual xfeatures. It independently enumerates the
required size of an entire XSAVE buffer to store all enabled
features.
calculate_xstate_sizes() currently uses the individual feature
size/alignment enumeration to independently recalculate the
required XSAVE buffer size. This is compared against the CPUID-
provided value.
Dynamic xstates introduce another wrinkle: with them, the
'struct fpu' buffer no longer contains all of the enabled state
components. This means that there are now two separate sizes:
1. The CPUID-enumerated all-feature "maxmimum" size
2. The size of the 'struct fpu' inline buffer
... then go on to explain how you added the #2 calculations
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index e03853bb2603..75969f1ef4b3 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -590,24 +590,36 @@ static void check_xstate_against_struct(int nr)
> }
> }
>
> -/*
> - * This essentially double-checks what the cpu told us about
> - * how large the XSAVE buffer needs to be. We are recalculating
> - * it to be safe.
> +/**
> + * calculate_xstate_sizes() - Calculate the xstate per-task buffer sizes.
> + *
> + * Record the minimum and double-check the maximum against what the CPU
> + * told.
> *
> * Independent XSAVE features allocate their own buffers and are not
> * covered by these checks. Only the size of the buffer for task->fpu
> * is checked here.
> + *
> + * Dynamic user states are stored in this per-task buffer. They account
> + * for the delta between the maximum and the minimum.
> + *
> + * Return: nothing.
> */
I'm not sure "Return: nothing" is worth saying. The "void" kinda gives
it away.
> -static void do_extra_xstate_size_checks(void)
> +static void calculate_xstate_sizes(void)
> {
> - int paranoid_xstate_size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
> - int i;
> + int xstate_size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
> + int xstate_size_no_dynamic, i;
Both the changelog and the function description talk about the maximum
and minimum sizes of the xsave buffer. Unfortunately, that doesn't seem
to be reflected in the naming at all.
> + xstate_size_no_dynamic = xstate_size;
>
> for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
> + bool user_dynamic;
> +
> if (!xfeature_enabled(i))
> continue;
>
> + user_dynamic = (xfeatures_mask_user_dynamic & BIT_ULL(i)) ? true : false;
> +
> check_xstate_against_struct(i);
> /*
> * Supervisor state components can be managed only by
> @@ -617,27 +629,39 @@ static void do_extra_xstate_size_checks(void)
> XSTATE_WARN_ON(xfeature_is_supervisor(i));
>
> /* Align from the end of the previous feature */
> - if (xfeature_is_aligned(i))
> - paranoid_xstate_size = ALIGN(paranoid_xstate_size, 64);
> + if (xfeature_is_aligned(i)) {
> + xstate_size = ALIGN(xstate_size, 64);
> + if (!user_dynamic)
> + xstate_size_no_dynamic = ALIGN(xstate_size_no_dynamic, 64);
> + }
> /*
> * The offset of a given state in the non-compacted
> * format is given to us in a CPUID leaf. We check
> * them for being ordered (increasing offsets) in
> * setup_xstate_features(). XSAVES uses compacted format.
> */
> - if (!cpu_feature_enabled(X86_FEATURE_XSAVES))
> - paranoid_xstate_size = xfeature_uncompacted_offset(i);
> + if (!cpu_feature_enabled(X86_FEATURE_XSAVES)) {
> + xstate_size = xfeature_uncompacted_offset(i);
> + if (!user_dynamic)
> + xstate_size_no_dynamic = xfeature_uncompacted_offset(i);
> + }
> /*
> * The compacted-format offset always depends on where
> * the previous state ended.
> */
> - paranoid_xstate_size += xfeature_size(i);
> + xstate_size += xfeature_size(i);
> + if (!user_dynamic)
> + xstate_size_no_dynamic += xfeature_size(i);
> }
I'm not a super big fan of how that loop turned out. It's trying to
keep two variables in parallel.
How about making calculate_xstate_sizes() *return* the xstate size, and
pass 'user_dynamic' to it:
/* 'false' to exclude dynamic states: */
xstate_size_min = calculate_xstate_sizes(false);
/* 'true' to include dynamic states: */
xstate_size_max = calculate_xstate_sizes(true);
... then go on to do the checks.
> /*
> * The size accounts for all the possible states reserved in the
> * per-task buffer. Check against the maximum size.
> */
> - XSTATE_WARN_ON(paranoid_xstate_size != get_xstate_config(XSTATE_MAX_SIZE));
> + XSTATE_WARN_ON(xstate_size != get_xstate_config(XSTATE_MAX_SIZE));
> +
> + /* Record the one without dynamic states as the minimum. */
> + set_xstate_config(XSTATE_MIN_SIZE, xstate_size_no_dynamic);
> }
>
>
> @@ -738,14 +762,11 @@ static int __init init_xstate_size(void)
> */
> set_xstate_config(XSTATE_MAX_SIZE, possible_xstate_size);
>
> - /* Perform an extra check for the maximum size. */
> - do_extra_xstate_size_checks();
> -
> /*
> - * Set the minimum to be the same as the maximum. The dynamic
> - * user states are not supported yet.
> + * Calculate and double-check the maximum size. Calculate and record
> + * the minimum size.
> */
> - set_xstate_config(XSTATE_MIN_SIZE, possible_xstate_size);
> + calculate_xstate_sizes();
>
> /* Ensure the minimum size fits in the statically-allocated buffer: */
> if (!is_supported_xstate_size(get_xstate_config(XSTATE_MIN_SIZE)))
>