Re: RISCV Vector unit disabled by default for new task (was Re: [PATCH v12 17/17] riscv: prctl to enable vector commands)

From: Palmer Dabbelt
Date: Fri Dec 09 2022 - 01:27:34 EST


On Thu, 08 Dec 2022 21:16:06 PST (-0800), Vineet Gupta wrote:
Hi Darius, Andrew, Palmer

On 9/21/22 14:43, Chris Stillson wrote:
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c

@@ -134,7 +135,6 @@ void start_thread(struct pt_regs *regs, unsigned long pc,
if (WARN_ON(!vstate->datap))
return;
}
- regs->status |= SR_VS_INITIAL;


Perhaps not obvious from the patch, but this is a major user experience
change: As in V unit would be turned off for a new task and we will rely
on a userspace prctl (also introduced in this patch) to enable V.

IMO that's the only viable option: enabling V adds more user-visible state, which is a uABI break. I haven't really had time to poke through all the versions here, but I'd have the call look something like

prctl(RISCV_ENABLE_V, min_vlenb, max_vlenb, flags);

where

* min_vlenb is the smallest VLENB that userspace can support. There's alreday an LLVM argument for this, I haven't dug into the generated code but I assume it'll blow up on smaller VLENB systems somehow.
* max_vlenb is the largest VLENB that userspace can support.
* flags is just a placeholder for now, with 0 meaning "V as defined by 1.0 for all threads in this proces". That should give us an out if something more complicated happens in the future.

That way VLA code can call `prctl(RISCV_ENABLE_V, 128, 8192, 0)` as it supports any V 1.0 implementation, while code with other constraints can avoid having V turned on in an unsupported configuration.

I think we can start out with no flags, but there's a few I could see being useful already:

* Cross process/thread enabling. I think a reasonable default is "enable V for all current and future threads in this process", but one could imagine flags for "just this thread" vs "all current threads", a default for new threads, and a default for child processes. I don't think it matters so much what we pick as a default, just that it's written down.
* Setting the VLENB bounds vs updating them. I'm thinking for shared libraries, where they'd only want to enable V in the shared library if it's already in a supported configuration. I'm not sure what the right rules are here, but again it's best to write that down.
* Some way to disable V. Maybe we just say `prctl(RISCV_ENABLE_V, 0, 0, ...)` disables V, or maybe it's a flag? Again, it should just be written down.
* What exactly we're enabling -- is it the V extension, or just the V registers?

There's a bunch of subtly here, though, so I think we'd at least want glibc and gdb support posted before committing to any uABI. It's probably also worth looking at what the Arm folks did for SVE: I gave it a quick glance and it seems like there's a lot of similarities with what I'm suggesting here, but again a lot of this is pretty subtle stuff so it's hard to tell just at a glance.

I know some of you had different opinion on this in the past [1], so
this is to make sure everyone's on same page.
And if we agree this is the way to go, how exactly will this be done in
userspace.

glibc dynamic loader will invoke the prctl() ? How will it decide
whether to do this (or not) - will it be unconditional or will it use
the hwcap - does latter plumbing exist already ? If so is it AT_HWCAP /
HWCAP2.

That part I haven't sorted out yet, and I don't think it's sufficient to just say "userspace should enable what it can support" because of how pervasive V instructions are going to be.

I don't think we need HWCAP, as userspace will need to call the prctl() anyway to turn on V and thus can just use the success/failure of that to sort things out.
Maybe it's sufficient to rely on some sort of sticky prctl() (or sysctl type thing, the differences there would be pretty subtle) and just not worry about it, but having some way of encoding this in the ELF seems nice. That said, we've had a bunch of trouble sorting out the ISA encoding in ELFs so maybe it's just not worth bothering?

Also for static linked executables, where will the prctl be called from ?

I guess that's pretty far in the weeds, but we could at least hook CRT to insert the relevant code. We'd really need to sort out how we're going to encode the V support in binaries, though.

[1] https://sourceware.org/pipermail/libc-alpha/2021-November/132883.html