Re: RISCV Vector unit disabled by default for new task (was Re: [PATCH v12 17/17] riscv: prctl to enable vector commands)

From: Andrew Waterman
Date: Fri Dec 09 2022 - 14:59:45 EST


On Fri, Dec 9, 2022 at 11:42 AM Vineet Gupta <vineetg@xxxxxxxxxxxx> wrote:
>
>
> On 12/9/22 09:21, Palmer Dabbelt wrote:
> > On Fri, 09 Dec 2022 05:04:23 PST (-0800), fweimer@xxxxxxxxxx wrote:
> >> * Darius Rad:
> >>
> >>> On Fri, Dec 09, 2022 at 01:32:33PM +0100, Florian Weimer via
> >>> Libc-alpha wrote:
> >>>> * Darius Rad:
> >>>>
> >>>> > On Fri, Dec 09, 2022 at 11:02:57AM +0100, Florian Weimer wrote:
> >>>> >> * Andrew Waterman:
> >>>> >>
> >>>> >> > This suggests that ld.so, early-stage libc, or possibly both
> >>>> will need
> >>>> >> > to make this prctl() call, perhaps by parsing the ELF headers
> >>>> of the
> >>>> >> > binary and each library to determine if the V extension is used.
> >>>> >>
> >>>> >> If the string functions use the V extension, it will be enabled
> >>>> >> unconditionally. So I don't see why it's okay for libc to
> >>>> trigger this
> >>>> >> alleged UAPI change, when the kernel can't do it by default.
> >>>> >>
> >>>> >
> >>>> > Because the call to enable can fail and userspace needs to deal
> >>>> with that.
> >>>>
> >>>> Failure is usually indicated by an AT_HWCAP or AT_HWCAP2 bit remaining
> >>>> zero, or perhaps a special CPU register (although that is more
> >>>> unusual).
> >>>
> >>> That would indicate that the extension is not present, which is one
> >>> of, but
> >>> not the only way it can fail.
> >>
> >> I think you should bring down the number of failure modes. HWCAP has
> >> the advantage that it communicates kernel/hypervisor/firmware/CPU
> >> support in a single bit, which simplifies the programming model and
> >> avoids hard-to-detect bugs. It's not clear why it would be beneficial
> >> to continue on ENOMEM failures here because the system must clearly be
> >> in bad shape at this point, and launching a new process is very unlikely
> >> to improve matters. So I think the simpler programming model is the way
> >> to go here.
> >>
> >>> The vector extension relies on dynamically allocated memory in the
> >>> kernel,
> >>> which can fail.
> >
> > The issue I'm worried about is that V needs more space in the
> > ucontext-type structures. We have an extensibility scheme there so we
> > think it can be made to work, but IIUC we'll need glibc to be updated
> > to handle the extended contexts in order to avoid losing state when
> > doing ucontext-related operations like signal handling.
>
> Sorry this is not relevant to this thread. I started a different thread
> on ABI/sigcontext aspects, lets discuss it there.
>
> >
> > I don't see a way to handle that with just HWCAP, as we essentially
> > need some bi-directional communicaton between userspace and the kernel
> > so they can both decide to turn on V. I don't think we strictly need
> > a system call to do that, we kicked around the idea of encoding this
> > in the ELF, but there's a lot of flavors of vector in RISC-V and we've
> > had trouble trying to encode these in binaries before so it seems
> > easier to just use the syscall.
> >
> >> But this failure can be reported as part of execve and clone.
> >>
> >>> It also provides the opportunity for the kernel to deny access to the
> >>> vector extension, perhaps due to administrative policy or other future
> >>> mechanism.
> >>
> >> HWCAP can do this, too.
>
> Having the prctl as general purpose knob to disable the V unit for
> various reasons makes sense.
>
> But keeping the V unit disabled by default and using prctl as a
> gatekeeper to enable it feels unnecessary and tedious.
> Here's my reasoning below (I'm collating comments from prior msgs as well).
>
> 1. Doesn't it add another userspace ABI which is already a headache for
> this feature. And that needs to be built into not just libc but
> potentially other runtimes too. Even after implemention there will be an
> interim pain as the new prctl takes time to trickle down into tooling
> and headers. Besides the new stuff will never be compatible with older
> kernel but that is a minor point since older kernel not supporting V is
> a deal breaker anyways.
>
> 2. People want the prctl gatekeeping for ability to gracefully handle
> memory allocation failure for the extra V-state within kernel. But that
> is only additional 4K (for typical 128 wide V regs) per task. If that is
> failing, the system is not doing well anyways. Besides it is not an
> issue at all since ENOMEM in clone/execve for the additional space
> should handle the failure anyways. Only very sophisticated apps would
> downgrade from executing V to Scalar code if the prctl failed. Instead
> memory allocation is more likely to be an issue when copying V state on
> a deep user stack across signal handling but there's nothing we can do
> about it.
>
> 3. Another argument to prctl gatekeeping is ensuring user-space conforms
> to vector length. But isn't the holy grail of RV V-extension VLA (Vector
> Length Agnostic) programming.

Yes, a suitable ABI for the V extension should cater cleanly to the
VLA model, since that's a major selling point of this ISA. The
baseline assumption should be that programs will execute correctly
regardless of VLEN (subject to the constraint that the V extension
requires VLEN >= 128, of course).

It's of course valid to construct programs with VLEN-dependent
behavior (e.g. dynamic dispatch to routines optimized differently for
different VLEN), but it should be considered the program's
responsibility to get that right. I don't think the ABI needs to
furnish guard rails.

> I expect most implements to follow that.
> If there are super sophisticated (or dumb) apps that don't follow, they
> will fail randomly. I think of Vector Length as any other ISA extensions
> - its not that currently apps are required to prctl() for bitmanip
> extension if they want to use it. Sure they could use AT_HWCAP (or
> /proc/cpuinfo or any other portable way) to query the capability, same
> can be done for V as well. Besides vlen is readable from user space so
> the app can read it to make sure. My worry is we are providing
> additional safety net to a small category of apps at the expense of
> making it tiresome for everyone else.
>
> HWCAP solves the kernel to user-space communication of capabilities. The
> prctl is for user-space to kernel communication but I'm not convinced it
> is really solving problems for the common case.
>
> Thx,
> -Vineet