Re: [PATCH v3] lib/raid/xor: x86: Add AVX-512 optimized xor_gen()

From: Eric Biggers

Date: Mon Jun 15 2026 - 16:11:17 EST


On Mon, Jun 15, 2026 at 12:03:38PM -0700, Eric Biggers wrote:
> Note: for now I omitted the cpu_has_xfeatures() check that the AVX-512
> optimized crypto and CRC code does, since it's not implemented on
> User-Mode Linux and it's never been present in the RAID6 code either.

By the way, Sashiko keeps complaining about this decision.

Maybe the x86 maintainers have some advice here?

For context: on x86 processors, executing AVX or AVX-512 instructions
requires not just that the CPU supports the feature, but also that the
operating system has set certain bits in XCR0. For example, all
EVEX-coded instructions (i.e. AVX-512) require XCR0=111xx111b. (See
Intel manual "2.6.11.1 State Dependent #UD".)
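To spell out what those bit patterns mean, here is a rough standalone
sketch (plain C, not kernel code). The DEMO_* names and xcr0_allows()
are made up for illustration; the bit positions are the XCR0 state
components as I understand them from the Intel manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* XCR0 state-component bits (hypothetical names, local to this sketch). */
#define DEMO_XCR0_X87        (UINT64_C(1) << 0) /* x87 FPU state */
#define DEMO_XCR0_SSE        (UINT64_C(1) << 1) /* XMM registers */
#define DEMO_XCR0_YMM        (UINT64_C(1) << 2) /* upper halves of YMM0-15 */
#define DEMO_XCR0_OPMASK     (UINT64_C(1) << 5) /* k0-k7 */
#define DEMO_XCR0_ZMM_HI256  (UINT64_C(1) << 6) /* upper halves of ZMM0-15 */
#define DEMO_XCR0_HI16_ZMM   (UINT64_C(1) << 7) /* ZMM16-31 */

/* xxxxx111b: what AVX needs. */
#define DEMO_XCR0_AVX_MASK \
	(DEMO_XCR0_X87 | DEMO_XCR0_SSE | DEMO_XCR0_YMM)

/* 111xx111b: what EVEX-coded (AVX-512) instructions need. */
#define DEMO_XCR0_AVX512_MASK \
	(DEMO_XCR0_AVX_MASK | DEMO_XCR0_OPMASK | \
	 DEMO_XCR0_ZMM_HI256 | DEMO_XCR0_HI16_ZMM)

static_assert(DEMO_XCR0_AVX_MASK == 0x07, "xxxxx111b");
static_assert(DEMO_XCR0_AVX512_MASK == 0xE7, "111xx111b");

/* True if every bit in 'needed' is set in 'xcr0'; the 'x' bits are ignored. */
static bool xcr0_allows(uint64_t xcr0, uint64_t needed)
{
	return (xcr0 & needed) == needed;
}
```

So an XCR0 value of 0x07 would allow AVX but not AVX-512, while 0xE7
would allow both.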

Therefore, most of the kernel's AVX and AVX-512 optimized code not only
checks X86_FEATURE_AVX* but also calls cpu_has_xfeatures() to check XCR0.

But "most" isn't all. The RAID6 code, for example, doesn't check
cpu_has_xfeatures(). So if you boot a kernel in QEMU using
"-cpu max,xsave=off", it already crashes when the RAID6 code does its
boot-time benchmark.

Part of the reason for that omission is probably that UML doesn't
provide an implementation of cpu_has_xfeatures(), while the x86 RAID
(XOR and RAID6) code is enabled on UML.

It could be implemented for UML by using the xgetbv instruction, as
userspace programs do. (We'd also need to copy the XFEATURE_MASK_*
constants, as UML can't include arch/x86/include/asm/fpu/types.h.)
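For reference, the userspace-style check would look roughly like the
sketch below. This is only an illustration, not a proposed UML patch:
read_xcr0() and demo_has_xfeatures() are hypothetical names, and it
assumes a GCC- or Clang-compatible compiler that provides <cpuid.h> and
inline asm. Note that OSXSAVE has to be checked first, since xgetbv
itself faults when the OS hasn't enabled XSAVE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* Stores XCR0 in *xcr0 and returns true, or returns false if unreadable. */
static bool read_xcr0(uint64_t *xcr0)
{
	unsigned int eax, ebx, ecx, edx;

	/* CPUID.1:ECX bit 27 is OSXSAVE; without it, xgetbv raises #UD. */
	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || !(ecx & (1u << 27)))
		return false;

	/* xgetbv with ECX = 0 returns XCR0 in EDX:EAX. */
	__asm__ volatile("xgetbv" : "=a"(eax), "=d"(edx) : "c"(0));
	*xcr0 = ((uint64_t)edx << 32) | eax;
	return true;
}
#else
/* Non-x86 build: there is no XCR0 to read. */
static bool read_xcr0(uint64_t *xcr0)
{
	(void)xcr0;
	return false;
}
#endif

/* Hypothetical stand-in for cpu_has_xfeatures(): are all 'needed' bits set? */
static bool demo_has_xfeatures(uint64_t needed)
{
	uint64_t xcr0;

	return read_xcr0(&xcr0) && (xcr0 & needed) == needed;
}
```

A caller would pass 0xE7 (111xx111b) for AVX-512 or 0x07 (xxxxx111b)
for AVX, in addition to checking the CPUID feature bit itself.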

But I wanted to ask: do we really care about the case where features are
"supported" but their XCR0 bits aren't set? Perhaps the kernel just
doesn't/shouldn't support weird cases like "-cpu max,xsave=off"?

If this case indeed needs to be handled, could we make things easier for
the kernel's AVX and AVX-512 optimized code? Currently AVX-512 needs:

    if (boot_cpu_has(X86_FEATURE_AVX512F) &&
        cpu_has_xfeatures(XFEATURE_MASK_FP | XFEATURE_MASK_SSE |
                          XFEATURE_MASK_YMM | XFEATURE_MASK_AVX512, NULL))

How about we make X86_FEATURE_AVX512F depend on XCR0=111xx111b, and
X86_FEATURE_AVX depend on XCR0=xxxxx111b? Then the cpu_has_xfeatures()
check wouldn't be needed. Is there any reason not to do that?

- Eric