[PATCH v11 23/29] x86/fpu/xstate: Skip writing zeros to signal frame for dynamic user states if in INIT-state

From: Chang S. Bae
Date: Fri Oct 01 2021 - 18:46:02 EST


By default, for XSTATE features in the INIT-state, XSAVE writes zeros to
the uncompressed destination buffer.

E.g., if you are not using AVX-512, you will still get a bunch of zeros on
the signal stack where live AVX-512 data would go.

For permission-required states (currently AMX state), explicitly skip this
data transfer. As a result, XSAVE does not touch the user buffer for the
AMX region when that state is in its INIT-state.

[ Reading XINUSE takes about 20-30 cycles, but writing zeros costs about
five times as much or more, e.g., for XTILEDATA. ]
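
The mask selection described above can be sketched in user space as
follows. This is only an illustration, not the kernel code:
sigframe_xsave_mask() and sigframe_mask_examples() are hypothetical
helpers, the xinuse argument stands in for the value read with XGETBV(1),
and treating XTILEDATA alone as the permission-required set is an
assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions follow the architectural XSAVE component numbering. */
#define XFEATURE_MASK_FP          (1ULL << 0)
#define XFEATURE_MASK_SSE         (1ULL << 1)
#define XFEATURE_MASK_YMM         (1ULL << 2)
#define XFEATURE_MASK_XTILE_CFG   (1ULL << 17)
#define XFEATURE_MASK_XTILE_DATA  (1ULL << 18)

/*
 * Hypothetical helper: compute the XSAVE request mask for a signal frame.
 *
 * uabi_mask: user-visible features that may appear in the frame
 * perm_mask: permission-required (dynamic) features, assumed to be AMX data
 * xinuse:    features not in INIT-state, as XGETBV(1) would report them
 * expanded:  non-zero if the task's buffer covers the dynamic features
 */
static uint64_t sigframe_xsave_mask(uint64_t uabi_mask, uint64_t perm_mask,
				    uint64_t xinuse, int expanded)
{
	/* Features that need no permission are always requested. */
	uint64_t mask = uabi_mask & ~perm_mask;

	/* Dynamic features are requested only if they hold live data. */
	if (expanded)
		mask |= uabi_mask & xinuse & perm_mask;

	return mask;
}

/* Hypothetical usage: three cases checked with assertions. */
static void sigframe_mask_examples(void)
{
	const uint64_t uabi = XFEATURE_MASK_FP | XFEATURE_MASK_SSE |
			      XFEATURE_MASK_YMM | XFEATURE_MASK_XTILE_CFG |
			      XFEATURE_MASK_XTILE_DATA;
	const uint64_t perm = XFEATURE_MASK_XTILE_DATA;
	const uint64_t base = uabi & ~perm;

	/* AMX data in INIT-state: its region is left out of the request. */
	assert(sigframe_xsave_mask(uabi, perm, XFEATURE_MASK_FP, 1) == base);

	/* AMX data in use: its region is written as before. */
	assert(sigframe_xsave_mask(uabi, perm, XFEATURE_MASK_XTILE_DATA, 1) == uabi);

	/* Buffer not expanded: dynamic features are never requested. */
	assert(sigframe_xsave_mask(uabi, perm, XFEATURE_MASK_XTILE_DATA, 0) == base);
}
```

Leaving a bit out of the request mask is what makes XSAVE leave the
corresponding region of the user buffer untouched, so the cost of the
zero-fill is traded for one XINUSE read.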

Signed-off-by: Chang S. Bae <chang.seok.bae@xxxxxxxxx>
Reviewed-by: Len Brown <len.brown@xxxxxxxxx>
Cc: x86@xxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
---
Changes from v10:
* Simplify the sigframe XSAVE code: replace check for XFD STATE with
XTILECFG and later STATE.

Changes from v9:
* Use cpu_feature_enabled() instead of boot_cpu_has(). (Borislav Petkov)

Changes from v5:
* Mentioned the optimization trade-offs in the changelog. (Dave Hansen)
* Added code comment.

Changes from v4:
* Added as a new patch.
---
arch/x86/include/asm/fpu/internal.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 06be4c247c97..5f013fa0b205 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -355,8 +355,12 @@ static inline int xsave_to_user_sigframe(struct xregs_state __user *buf)
 		mask = uabi_mask & ~xfeatures_mask_user_perm();

 		if (sig_xstate_expanded(current)) {
-			u64 cur_uabi_mask = uabi_mask & current->thread.fpu.state_mask;
+			u64 cur_uabi_mask;

+			if (cpu_feature_enabled(X86_FEATURE_XGETBV1))
+				cur_uabi_mask = uabi_mask & xgetbv(1);
+			else
+				cur_uabi_mask = uabi_mask & current->thread.fpu.state_mask;
 			mask |= cur_uabi_mask & xfeatures_mask_user_perm();
 		}
 	}
--
2.17.1