Re: RISCV Vector unit disabled by default for new task (was Re: [PATCH v12 17/17] riscv: prctl to enable vector commands)

From: Florian Weimer
Date: Thu Dec 15 2022 - 07:30:01 EST


* Björn Töpel:

>> For SVE, it is in fact disabled by default in the kernel. When a thread
>> executes the first SVE instruction, it will cause an exception, the kernel
>> will allocate memory for SVE state and enable TIF_SVE. Further use of SVE
>> instructions will proceed without exceptions. Although SVE is disabled by
>> default, it is enabled automatically. Since this is done automatically
>> during an exception handler, there is no opportunity for memory allocation
>> errors to be reported, as there are in the AMX case.
>
> Glibc has an SVE optimized memcpy, right? Doesn't that mean that pretty
> much all processes on an SVE capable system will enable SVE (lazily)? If
> so, that's close to "enabled by default" (unless SVE is disabled system
> wide).

Yes, see sysdeps/aarch64/multiarch/memcpy.c:

  static inline __typeof (__redirect_memcpy) *
  select_memcpy_ifunc (void)
  {
    INIT_ARCH ();

    if (sve && HAVE_AARCH64_SVE_ASM)
      {
        if (IS_A64FX (midr))
          return __memcpy_a64fx;
        return __memcpy_sve;
      }

    if (IS_THUNDERX (midr))
      return __memcpy_thunderx;

    if (IS_THUNDERX2 (midr) || IS_THUNDERX2PA (midr))
      return __memcpy_thunderx2;

    if (IS_FALKOR (midr) || IS_PHECDA (midr))
      return __memcpy_falkor;

    return __memcpy_generic;
  }

And the __memcpy_sve implementation actually uses SVE.
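
For illustration only, here is a minimal user-space sketch of the same
kind of decision: read the hardware capability bits the kernel exports
and pick an implementation from them. This is not the glibc code. The
helper names cpu_has_sve and select_memcpy_name are made up for this
example, and the fallback HWCAP_SVE value assumes the arm64 Linux
definition (bit 22 of AT_HWCAP).

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/auxv.h>

/* Assumption: arm64 Linux defines HWCAP_SVE as bit 22 of AT_HWCAP.
   Provide it here in case the libc headers do not.  */
#ifndef HWCAP_SVE
# define HWCAP_SVE (1UL << 22)
#endif

/* Hypothetical helper: true if the kernel advertises SVE to user space.
   On anything other than AArch64 the bit means something else, so
   report false there.  */
static bool
cpu_has_sve (void)
{
#if defined (__aarch64__)
  return (getauxval (AT_HWCAP) & HWCAP_SVE) != 0;
#else
  return false;
#endif
}

/* Hypothetical stand-in for an ifunc selector: returns the name of the
   variant that would be chosen, instead of a function pointer.  */
static const char *
select_memcpy_name (bool sve)
{
  return sve ? "sve" : "generic";
}
```

Calling select_memcpy_name (cpu_has_sve ()) at startup mirrors the
point above: nothing in the process has to opt in, the choice follows
from what the kernel reports.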

If there were a prctl to select the vector width and enable the vector
extension, we'd have to pick a width in glibc anyway. Likewise for any
other libc, the Go runtime, and so on. That's why I think the kernel is
in a better position to handle this.
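
As a rough analogue of what each runtime would end up doing, here is a
sketch using the existing arm64 prctl for querying the SVE vector
length. It is not the RISC-V interface under discussion. The helper
name current_sve_vl_bytes is made up, and the fallback constants assume
the values from the Linux UAPI headers (PR_SVE_GET_VL is 51, and the
length mask is 0xffff).

```c
#include <assert.h>
#include <sys/prctl.h>

/* Assumption: values taken from the Linux UAPI <linux/prctl.h>, in case
   the installed headers are too old to define them.  */
#ifndef PR_SVE_GET_VL
# define PR_SVE_GET_VL 51
#endif
#ifndef PR_SVE_VL_LEN_MASK
# define PR_SVE_VL_LEN_MASK 0xffff
#endif

/* Hypothetical helper: returns the calling thread's SVE vector length
   in bytes, or -1 if the kernel or CPU does not support the query.  */
static int
current_sve_vl_bytes (void)
{
  int ret = prctl (PR_SVE_GET_VL, 0, 0, 0, 0);
  if (ret < 0)
    return -1;
  /* The low bits hold the length; the upper bits are flags.  */
  return ret & PR_SVE_VL_LEN_MASK;
}
```

Every libc, language runtime and statically linked binary would need
its own copy of logic like this, plus a policy for which width to ask
for, if the kernel left the decision to user space.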

> AMX is a bit different from SVE and V; SVE/V is/would be used by glibc
> for memcpy and such, where I doubt that AMX would be used there. Then
> again, there's AVX512 which many argue that "turned on by default" was a
> mistake (ABI breakage/power consumption).

I don't think AMX is useful for string operations or the math functions
currently implemented in glibc.

Not everything in AVX-512 has high power consumption on relevant CPUs.
Furthermore, the extra registers that don't need VZEROUPPER help us to
avoid transaction aborts in RTM mode. If we had to enable AVX-512
explicitly in every process, I'm not sure if we would be using it today.
The complicated choices around AVX-512 (and AVX2 for earlier CPUs)
aren't particularly unique. These functions have different trade-offs
(optimizing for single thread/single process usage vs global system
behavior) on other architectures, too.

> There will likely be V support in glibc (str*/mem*). For systems that
> prefer having V "always-on", the UX of requiring all binaries to
> explicitly call prctl() is not great (as Andrew pointed out in earlier
> posts). A V knob based on some system policy in crt0? :-P

It wouldn't be in crt0 (which is statically linked); it would be in the
dynamic loader. So not quite as bad if policy revisions are required. But
glibc is not the only provider of userspace startup code, so future
tuning of userspace policy will remain complicated.

Thanks,
Florian