Re: [PATCH v9 08/26] x86/fpu/xstate: Introduce helpers to manage the XSTATE buffer dynamically
From: Bae, Chang Seok
Date: Fri Aug 13 2021 - 04:06:10 EST
On Aug 12, 2021, at 12:44, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Fri, Jul 30, 2021 at 07:59:39AM -0700, Chang S. Bae wrote:
>>
>> --- a/arch/x86/include/asm/trace/fpu.h
>> +++ b/arch/x86/include/asm/trace/fpu.h
>> @@ -89,6 +89,11 @@ DEFINE_EVENT(x86_fpu, x86_fpu_xstate_check_failed,
>> TP_ARGS(fpu)
>> );
>>
>> +DEFINE_EVENT(x86_fpu, x86_fpu_xstate_alloc_failed,
>> + TP_PROTO(struct fpu *fpu),
>> + TP_ARGS(fpu)
>
> Last time I said:
>
> "Yes, add it when it is really needed. Not slapping it proactively and
> hoping for any potential usage."
>
> Why is that thing still here?!
Before, there was no clear path for reporting the error code, which I took to
be the reason for this tracepoint. Now that a signal or an error-code return
is established, I should have removed it along with that change.
>> + * @mask: This bitmap tells which components reserved in the buffer.
>
> are reserved?
>
> What's this notion of reservation here? The mask is dictating what gets
> reserved in the buffer or what?
>
> Looking at the usage, that mask is simply saying which components are
> going to be saved in the buffer. So all this "reserved" bla is only
> confusing - drop it.
Okay. As I recall, this “reserved” wording started in a changelog. Since it is
confusing, I will make sure every instance of it is removed.
>> + *
>> + * Available once those arrays for the offset, size, and alignment info are
>> + * set up, by setup_xstate_features().
>> + *
>> + * Returns: The buffer size
>> + */
>> +unsigned int get_xstate_size(u64 mask)
>> +{
>> + unsigned int size;
>> + int i, nr;
>> +
>> + if (!mask)
>> + return 0;
>> +
>> + /*
>> + * The minimum buffer size excludes the dynamic user state. When a
>> + * task uses the state, the buffer can grow up to the max size.
>> + */
>> + if (mask == (xfeatures_mask_all & ~xfeatures_mask_user_dynamic))
>> + return get_xstate_config(XSTATE_MIN_SIZE);
>> + else if (mask == xfeatures_mask_all)
>> + return get_xstate_config(XSTATE_MAX_SIZE);
>> +
>> + nr = fls64(mask) - 1;
>> +
>> + if (!boot_cpu_has(X86_FEATURE_XSAVES))
>
> cpu_feature_enabled()
>
>> + return xstate_offsets[nr] + xstate_sizes[nr];
>
> From all the superfluous commenting, where a comment is really needed is
> here but there's none.
>
> What's that doing? No compacted states enabled so take the offset and
> size of the *last* state and use that as the buffer size?
Yes. In the non-compacted format, each state's offset is fixed on a given
machine regardless of RFBM, so the offset plus the size of the last state
gives the buffer size.
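To illustrate the point outside the kernel, here is a minimal user-space
sketch. The demo_* names and the layout table are made up for illustration
(real offsets and sizes come from CPUID leaf 0xD); only the "last state's
offset plus size" rule is taken from the hunk above.

```c
#include <stdint.h>

/* Hypothetical per-component layout, valid for bits 0..4 only. */
static const unsigned int demo_offsets[] = { 0, 160, 576, 960, 1024 };
static const unsigned int demo_sizes[]   = { 160, 256, 256, 64, 64 };

/* 1-based position of the highest set bit; 0 for an empty mask. */
static int demo_fls64(uint64_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/*
 * Non-compacted format: offsets do not depend on which components are
 * requested, so the buffer must reach the end of the highest component.
 */
static unsigned int demo_noncompacted_size(uint64_t mask)
{
	int nr;

	if (!mask)
		return 0;

	nr = demo_fls64(mask) - 1;
	return demo_offsets[nr] + demo_sizes[nr];
}
```

With this table, a mask of 0x4 and a mask of 0x7 give the same size, because
the lower components occupy their fixed slots whether requested or not.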
>> +
>> + if ((xfeatures_mask_all & (BIT_ULL(nr + 1) - 1)) == mask)
> ^^^^^^^^^^^^^^^^^^^^^
>
> That thing looks like a GENMASK_ULL() thing. Use it?
Looks like I was not familiar with this macro:
if ((xfeatures_mask_all & GENMASK_ULL(nr, 0)) == mask)
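As a quick check that the two forms agree, here is a user-space sketch. The
macros below are stand-ins written for this example, not the kernel headers,
and the demo_* helpers are hypothetical. One difference worth noting: the
open-coded form shifts by 64 when nr is 63, which is undefined in C, while
GENMASK_ULL(63, 0) is well defined.

```c
#include <stdbool.h>
#include <stdint.h>

/* User-space stand-ins for the kernel macros of the same name. */
#define BIT_ULL(n)		(1ULL << (n))
#define GENMASK_ULL(h, l)	((~0ULL << (l)) & (~0ULL >> (63 - (h))))

/* True if @mask equals @all restricted to bits nr..0. */
static bool demo_prefix_matches(uint64_t all, uint64_t mask, int nr)
{
	return (all & GENMASK_ULL(nr, 0)) == mask;
}

/* The open-coded form from the patch; only valid for nr < 63. */
static bool demo_prefix_matches_open_coded(uint64_t all, uint64_t mask, int nr)
{
	return (all & (BIT_ULL(nr + 1) - 1)) == mask;
}
```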
> Also, what is that test doing?!
>
> If a mask up to nr ANDed with mask_all is == mask?!
>
> You need to explain yourself a lot more here what you're doing. Why
> those two special cases if you can simply iterate over the extended
> states and be done with it? Except maybe the first two special cases
> which are trivial...
xstate_comp_offset[] is recorded for the compacted format with
xfeatures_mask_all. If the feature bits in the mask match xfeatures_mask_all
for every bit up to 'nr', this recorded offset can be used as is.
But it might be better to simplify this hunk for readability. I suspect its
call sites are not that performance-critical.
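One possible shape for such a simplification, as a user-space sketch only:
walk the extended states in the mask and accumulate their sizes, honoring the
64-byte alignment flag. The tables and demo_* names are assumptions for this
example; in the kernel the sizes and alignment info would come from the arrays
set up by setup_xstate_features().

```c
#include <stdbool.h>
#include <stdint.h>

/* 512-byte legacy area plus 64-byte XSAVE header. */
#define DEMO_LEGACY_AND_HEADER	576U
#define DEMO_NR_FEATURES	5

/* Hypothetical per-component sizes and alignment flags. */
static const unsigned int demo_comp_sizes[DEMO_NR_FEATURES] =
	{ 160, 256, 256, 40, 64 };
static const bool demo_comp_aligned[DEMO_NR_FEATURES] =
	{ false, false, false, false, true };

/* Compacted format: requested extended states are packed back to back. */
static unsigned int demo_compacted_size(uint64_t mask)
{
	unsigned int size = DEMO_LEGACY_AND_HEADER;
	int i;

	if (!mask)
		return 0;

	/* Components 0 and 1 live in the legacy area already counted above. */
	for (i = 2; i < DEMO_NR_FEATURES; i++) {
		if (!(mask & (1ULL << i)))
			continue;
		if (demo_comp_aligned[i])
			size = (size + 63) & ~63U;	/* round up to 64 bytes */
		size += demo_comp_sizes[i];
	}
	return size;
}
```

This trades the two precomputed shortcuts for a single loop, which seems
acceptable if the call sites are indeed not hot paths.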
>> @@ -848,6 +908,9 @@ void __init fpu__init_system_xstate(void)
>> if (err)
>> goto out_disable;
>>
>> + /* Make sure init_task does not include the dynamic user states. */
>
> My constant review question: why?
Every task’s state_mask should start out matching the default buffer.
fpu_clone() sets this for every task except init_task.
Maybe:
“Make sure init_task’s state_mask is aligned with its __default_state”
>> + current->thread.fpu.state_mask = (xfeatures_mask_all & ~xfeatures_mask_user_dynamic);
>> +
>> +/**
>> + * alloc_xstate_buffer - Allocate a buffer with the size calculated from
>
> This name doesn't even begin to tell me that this function deals with
> enlarging the xstate buffer with dynamic states. How is the caller
> supposed to know?
How about enlarge_xstate_buffer() or realloc_xstate_buffer()?
>
> Also, you need to move all possible xfeatures_mask_user_dynamic querying
> inside it so that its user doesn't have to do it. I'm looking at the
> callsite in xstateregs_set().
The query there checks whether the xstate buffer is already fully expanded, in
which case there is nothing to enlarge.
If the buffer is already at the maximum size, the code that retrieves
XSTATE_BV, this call, etc. should all be skipped there.
If the query is moved into this function, I guess the call site code becomes
a bit ugly.
> The other callsite in exc_device_not_available() seems to not check the
> dynamic states but uses only XFD. I guess I'll parse that properly when
> I get there but right now I have no clue why you're not checking the
> dynamic mask there.
In this case, I think it makes sense to move it into this function. But it is
not yet clear to me how best to adjust the above case.
>> +int alloc_xstate_buffer(struct fpu *fpu, u64 mask)
>> +{
>> + union fpregs_state *state;
>> + unsigned int oldsz, newsz;
>> + u64 state_mask;
>> +
>> + state_mask = fpu->state_mask | mask;
>> +
>> + oldsz = get_xstate_size(fpu->state_mask);
>> + newsz = get_xstate_size(state_mask);
>> +
>> + if (oldsz >= newsz)
>> + return 0;
>
> Why?
>
> Why not simply:
>
> if (fpu->state_mask == mask)
> return 0;
>
> /* vzalloc */
>
> /* free the old buffer */
> free_xstate_buffer(fpu);
>
> fpu->state = state;
> ...
>
> ?
>
> Our FPU code is a mess - you should try not to make it an even bigger
> one without a good reason.
Okay, maybe get_xstate_size() is overkill. But I think there should be a
sanity check like this:
if ((mask & fpu->state_mask) == mask)
return 0;
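To spell out how this differs from a plain equality test, a small user-space
sketch (demo_already_covered() is a hypothetical helper, not something in the
patch): the subset check also returns early when the request is a strict
subset of what the buffer already holds.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if every component in @mask is already present in @state_mask. */
static bool demo_already_covered(uint64_t state_mask, uint64_t mask)
{
	return (mask & state_mask) == mask;
}
```

For example, with state_mask 0x7 and a request of 0x4, the equality test
would fall through to a reallocation, while the subset check returns early.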
>> +
>> + state = vzalloc(newsz);
>> + if (!state) {
>> + /*
>> + * When allocation requested from #NM, the error code may
>> + * not be populated well. Then, this tracepoint is useful
>> + * for providing the failure context.
>> + */
>> + trace_x86_fpu_xstate_alloc_failed(fpu);
>> + return -ENOMEM;
>
> What happens with the old buffer here? It seems we leak it…
No, it is still pointed to by fpu->state and will be freed in the exit path.
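The ownership pattern can be sketched in user space like this. Everything
here is hypothetical (struct demo_fpu, demo_enlarge(), and the injectable
allocator stand in for the real struct fpu and vzalloc()); the sketch only
shows that a failed allocation leaves the old pointer in place.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct fpu; only what the sketch needs. */
struct demo_fpu {
	void *state;
	size_t size;
};

typedef void *(*demo_alloc_fn)(size_t size);

static void *demo_zalloc(size_t size)
{
	return calloc(1, size);
}

/* Always fails, to exercise the error path. */
static void *demo_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

static int demo_enlarge(struct demo_fpu *fpu, size_t newsz, demo_alloc_fn alloc)
{
	void *state;

	if (newsz <= fpu->size)
		return 0;

	state = alloc(newsz);
	if (!state)
		return -ENOMEM;	/* fpu->state is untouched and still owned */

	/* Copying the old contents is omitted from this sketch. */
	free(fpu->state);
	fpu->state = state;
	fpu->size = newsz;
	return 0;
}
```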
Thanks,
Chang