Re: RISCV Vector unit disabled by default for new task (was Re: [PATCH v12 17/17] riscv: prctl to enable vector commands)

From: Andrew Waterman
Date: Fri Dec 09 2022 - 02:42:55 EST


Requiring application programmers (i.e. those who write main()) to
make a prctl() call is obviously completely unacceptable, because
application programmers don't know whether the V extension is being
used. Auto-vectorization and libc-function implementations will use
the V extension without any application-programmer knowledge or
intervention. And obviously we don't want to preclude that.

This suggests that ld.so, early-stage libc, or possibly both will need
to make this prctl() call, perhaps by parsing the ELF headers of the
binary and each library to determine if the V extension is used.

Personally, I'm agnostic as to whether we put this onus on the kernel or
on user-space--I just want to make sure we're all on the same page
that it needs to be hidden behind libc/ld.so/etc. The onus can't be
on the application programmer.
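
To make the division of labor concrete, here is a minimal sketch of what
the startup-side decision might look like. Everything in it is an
assumption for illustration only: struct loaded_object, the uses_v flag
(however that ends up being derived from the ELF headers), and the
enable_v hook standing in for whatever prctl() call is eventually
defined. None of these names exist in glibc or the kernel today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for the main binary or one shared library.
 * How uses_v would be derived from the ELF is not settled. */
struct loaded_object {
    const char *name;
    bool uses_v;
};

/* Hypothetical hook that would wrap the eventual prctl() call.
 * Returns 0 on success, nonzero on failure. */
typedef int (*enable_v_fn)(void);

/* Stub hook used for illustration; it only counts invocations. */
static int stub_enable_v_calls;
static int stub_enable_v(void)
{
    stub_enable_v_calls++;
    return 0;
}

/* Enable V at most once, and only if some loaded object needs it.
 * Returns 0 if V was not needed or was enabled successfully,
 * otherwise whatever error the hook reported. */
static int enable_v_if_needed(const struct loaded_object *objs, size_t n,
                              enable_v_fn enable_v)
{
    assert(enable_v != NULL);
    for (size_t i = 0; i < n; i++) {
        if (objs[i].uses_v)
            return enable_v();
    }
    return 0;
}
```

The point is only that this decision lives in ld.so or early libc
startup code, so whoever writes main() never sees it.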

On Thu, Dec 8, 2022 at 8:27 PM Palmer Dabbelt <palmer@xxxxxxxxxxx> wrote:
>
> On Thu, 08 Dec 2022 21:16:06 PST (-0800), Vineet Gupta wrote:
> > Hi Darius, Andrew, Palmer
> >
> > On 9/21/22 14:43, Chris Stillson wrote:
> >> diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
> >>
> >> @@ -134,7 +135,6 @@ void start_thread(struct pt_regs *regs, unsigned long pc,
> >> if (WARN_ON(!vstate->datap))
> >> return;
> >> }
> >> - regs->status |= SR_VS_INITIAL;
> >>
> >
> > Perhaps not obvious from the patch, but this is a major user experience
> > change: As in V unit would be turned off for a new task and we will rely
> > on a userspace prctl (also introduced in this patch) to enable V.
>
> IMO that's the only viable option: enabling V adds more user-visible
> state, which is a uABI break. I haven't really had time to poke through
> all the versions here, but I'd have the call look something like
>
> prctl(RISCV_ENABLE_V, min_vlenb, max_vlenb, flags);
>
> where
>
> * min_vlenb is the smallest VLENB that userspace can support. There's
> already an LLVM argument for this, I haven't dug into the generated
> code but I assume it'll blow up on smaller VLENB systems somehow.
> * max_vlenb is the largest VLENB that userspace can support.
> * flags is just a placeholder for now, with 0 meaning "V as defined by
> 1.0 for all threads in this process". That should give us an out if
> something more complicated happens in the future.
>
> That way VLA code can call `prctl(RISCV_ENABLE_V, 128, 8192, 0)` as it
> supports any V 1.0 implementation, while code with other constraints can
> avoid having V turned on in an unsupported configuration.

VLA code needs to read the vlenb CSR; it can't assume 8192 (or any
other small number) is a safe upper bound.
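
As a rough illustration, here is a scalar model of a strip-mined VLA
loop in which the vector length is a run-time input rather than a
compile-time bound. The helper names read_vlenb() and add_u32() are
made up for this sketch, the non-RISC-V fallback value of 16 is an
arbitrary stand-in, and the "vl = min(n, VLMAX)" step is a simplified
model of what vsetvli returns (SEW=32, LMUL=1 assumed).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns VLENB, the vector register length in bytes.
 * When built for RISC-V with V enabled, this reads the vlenb CSR.
 * Elsewhere it returns an arbitrary stand-in so the sketch still builds. */
static size_t read_vlenb(void)
{
#if defined(__riscv_vector)
    size_t vlenb;
    __asm__ volatile ("csrr %0, vlenb" : "=r"(vlenb));
    return vlenb;
#else
    return 16; /* stand-in only: corresponds to VLEN = 128 bits */
#endif
}

/* Scalar model of a strip-mined loop computing dst[i] = a[i] + b[i].
 * Nothing here assumes an upper bound on vlenb. */
static void add_u32(uint32_t *dst, const uint32_t *a, const uint32_t *b,
                    size_t n, size_t vlenb)
{
    size_t vlmax = vlenb / sizeof(uint32_t); /* elements per register */
    assert(vlmax > 0);
    while (n > 0) {
        size_t vl = n < vlmax ? n : vlmax;   /* simplified vsetvli result */
        for (size_t i = 0; i < vl; i++)      /* stands in for one vector op */
            dst[i] = a[i] + b[i];
        dst += vl;
        a += vl;
        b += vl;
        n -= vl;
    }
}
```

The loop produces the same result whether vlenb is 16 or far larger
than 8192, which is why VLA code has no meaningful max_vlenb to report.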

>
> I think we can start out with no flags, but there's a few I could see
> being useful already:
>
> * Cross process/thread enabling. I think a reasonable default is
> "enable V for all current and future threads in this process", but one
> could imagine flags for "just this thread" vs "all current threads", a
> default for new threads, and a default for child processes. I don't
> think it matters so much what we pick as a default, just that it's
> written down.
> * Setting the VLENB bounds vs updating them. I'm thinking for shared
> libraries, where they'd only want to enable V in the shared library if
> it's already in a supported configuration. I'm not sure what the
> right rules are here, but again it's best to write that down.
> * Some way to disable V. Maybe we just say `prctl(RISCV_ENABLE_V, 0, 0,
> ...)` disables V, or maybe it's a flag? Again, it should just be
> written down.
> * What exactly we're enabling -- is it the V extension, or just the V
> registers?
>
> There's a bunch of subtlety here, though, so I think we'd at least want
> glibc and gdb support posted before committing to any uABI. It's
> probably also worth looking at what the Arm folks did for SVE: I gave it
> a quick glance and it seems like there's a lot of similarities with what
> I'm suggesting here, but again a lot of this is pretty subtle stuff so
> it's hard to tell just at a glance.
>
> > I know some of you had different opinion on this in the past [1], so
> > this is to make sure everyone's on same page.
> > And if we agree this is the way to go, how exactly will this be done in
> > userspace.
> >
> > glibc dynamic loader will invoke the prctl() ? How will it decide
> > whether to do this (or not) - will it be unconditional or will it use
> > the hwcap - does latter plumbing exist already ? If so is it AT_HWCAP /
> > HWCAP2.
>
> That part I haven't sorted out yet, and I don't think it's sufficient to
> just say "userspace should enable what it can support" because of how
> pervasive V instructions are going to be.
>
> I don't think we need HWCAP, as userspace will need to call the prctl()
> anyway to turn on V and thus can just use the success/failure of that to
> sort things out.
>
> Maybe it's sufficient to rely on some sort of sticky prctl() (or sysctl
> type thing, the differences there would be pretty subtle) and just not
> worry about it, but having some way of encoding this in the ELF seems
> nice. That said, we've had a bunch of trouble sorting out the ISA
> encoding in ELFs so maybe it's just not worth bothering?
>
> > Also for static linked executables, where will the prctl be called from ?
>
> I guess that's pretty far in the weeds, but we could at least hook CRT
> to insert the relevant code. We'd really need to sort out how we're
> going to encode the V support in binaries, though.
>
> > [1] https://sourceware.org/pipermail/libc-alpha/2021-November/132883.html