Re: [GIT PULL] RISC-V updates for v7.0

From: Deepak Gupta

Date: Wed Feb 18 2026 - 20:57:57 EST


Hi Rick,

Comments inline.

On Wed, Feb 18, 2026 at 09:58:41PM +0000, Edgecombe, Rick P wrote:
On Wed, 2026-02-18 at 11:57 -0800, Deepak Gupta wrote:
Later in 2022 when I started doing work for cfi extensions on RISC-V, I started
with arch-agnostic prctl for enabling shadow stack and branch tracking. I was
hoping that all arches would end up converging on that. Soon Mark Brown sent
out arm's "Guarded Control Stack" (aka shadow stack) which used the
arch-agnostic shadow stack prctl.

So today we have
- arch specific shadow stack prctl for managing (enable/lock/disable) shadow
   stack on x86

- arch-agnostic prctl for shadow stack management on arm64 and RISC-V
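
For reference, a minimal sketch of how userspace could query shadow stack status through either interface. The helper name `shadow_stack_enabled()` is hypothetical, and the `#define` fallbacks are only there for older uapi headers. The sketch only queries: enabling from an ordinary C function is left out on purpose, since returning from a frame that was entered before the shadow stack existed would typically fault.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Arch-agnostic interface (arm64, RISC-V): include/uapi/linux/prctl.h */
#ifndef PR_GET_SHADOW_STACK_STATUS
#define PR_GET_SHADOW_STACK_STATUS 74
#endif
#ifndef PR_SHADOW_STACK_ENABLE
#define PR_SHADOW_STACK_ENABLE (1UL << 0)
#endif

/* x86-specific interface: arch/x86/include/uapi/asm/prctl.h */
#ifndef ARCH_SHSTK_STATUS
#define ARCH_SHSTK_STATUS 0x5005
#endif
#ifndef ARCH_SHSTK_SHSTK
#define ARCH_SHSTK_SHSTK (1UL << 0)
#endif

/*
 * Hypothetical helper: returns 1 if shadow stack is enabled for the calling
 * thread, 0 if it is disabled, and -1 if the kernel or hardware does not
 * support the query. The raw status word is stored in *raw_status.
 */
int shadow_stack_enabled(unsigned long *raw_status)
{
    assert(raw_status != NULL);
    *raw_status = 0;
#if defined(__x86_64__)
    /* x86: arch-specific arch_prctl() */
    if (syscall(SYS_arch_prctl, ARCH_SHSTK_STATUS, raw_status) != 0)
        return -1;
    return (*raw_status & ARCH_SHSTK_SHSTK) ? 1 : 0;
#else
    /* arm64 / RISC-V: arch-agnostic prctl() */
    if (prctl(PR_GET_SHADOW_STACK_STATUS, raw_status, 0, 0, 0) != 0)
        return -1;
    return (*raw_status & PR_SHADOW_STACK_ENABLE) ? 1 : 0;
#endif
}
```

The `#if` in the helper is exactly the fragmentation being discussed: the same question needs a different syscall on x86 than on arm64 and RISC-V.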


If we land arch-agnostic prctl for enabling branch tracking for userspace as
part of risc-v patches, I am hoping we can leverage that for x86 "branch
tracking enabling" as well. I don't know if "BTI" is enabled for userspace in
the arm64 world but if it isn't then it can use the same prctl. This creates
symmetry and convergence between the three major arches for branch tracking
support.

Arm already uses PROT_BTI to enable their landing pad like thing. It doesn't
need a prctl AFAIU. Peterz had been suggesting we do a similar PROT for x86 user
IBT. Although an additional prctl might still be required for x86. We'd have to
actually start taking the patches upstream to see.
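
As an illustration of the PROT_BTI model mentioned above, here is a rough sketch of mapping a BTI-guarded executable region on arm64. `map_bti_guarded()` is a hypothetical helper, and the `PROT_BTI` fallback value is only used when the libc headers do not provide it. On other architectures the sketch simply reports failure.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* arm64 uapi value, used only if the libc headers lack it. */
#if defined(__aarch64__) && !defined(PROT_BTI)
#define PROT_BTI 0x10
#endif

/*
 * Hypothetical helper: map `len` bytes of executable memory whose pages are
 * marked as BTI-guarded. Returns 0 and stores the mapping in *out on success,
 * or -1 with *out == NULL on failure (always the case off arm64).
 */
int map_bti_guarded(size_t len, void **out)
{
    assert(out != NULL);
    *out = NULL;
#if defined(__aarch64__)
    void *p = mmap(NULL, len, PROT_READ | PROT_EXEC | PROT_BTI,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    *out = p;
    return 0;
#else
    (void)len;
    errno = ENOSYS;   /* no per-page landing pad protection bit here */
    return -1;
#endif
}
```

The point of the sketch is that the opt-in is per mapping, so no task-wide prctl is involved.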

x86 doesn't have any equivalent of the BTI bit in PTEs to mark code pages. IIRC,
it does have a mechanism where a bitmap has to be prepared and each entry in the
bitmap encodes whether a page is a legacy code page (without `endbr64`) or a
modern code page (with `endbr64`). The CPU consults this bitmap to decide
whether to suppress the fault.
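
To illustrate the bitmap idea, here is a simplified software model, not a description of the actual hardware layout. It assumes one bit per 4 KiB page, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Mark the page containing `addr` as a legacy code page (no endbr64). */
void bitmap_mark_legacy(uint8_t *bitmap, size_t bitmap_bytes, uint64_t addr)
{
    uint64_t page = addr >> PAGE_SHIFT;

    assert(bitmap != NULL && page / 8 < bitmap_bytes);
    bitmap[page / 8] |= (uint8_t)(1u << (page % 8));
}

/* Return 1 if the page containing `addr` is marked legacy, else 0. */
int bitmap_is_legacy(const uint8_t *bitmap, size_t bitmap_bytes, uint64_t addr)
{
    uint64_t page = addr >> PAGE_SHIFT;

    if (bitmap == NULL || page / 8 >= bitmap_bytes)
        return 0;       /* pages outside the bitmap are treated as modern */
    return (bitmap[page / 8] >> (page % 8)) & 1;
}

/*
 * Model of the check after an indirect branch. Returns 1 if a fault would be
 * raised, 0 otherwise. The bitmap is only consulted when the target does not
 * start with endbr64, which is the extra memory load on calls into legacy
 * code.
 */
int indirect_branch_faults(const uint8_t *bitmap, size_t bitmap_bytes,
                           uint64_t target, int target_has_endbr)
{
    if (target_has_endbr)
        return 0;
    return !bitmap_is_legacy(bitmap, bitmap_bytes, target);
}
```

In this model the kernel would own the bitmap and would have to update it whenever a legacy object is mapped, for example on `dlopen`.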

As of today almost all distros and packages are built with the
`-fcf-protection=full` compile flag, which means all userspace binaries should
have `endbr64` compiled in (i.e. modern binaries).

To be very specific, anyone using a kernel beyond 7.0 (that's where x86
`endbr64` enabling support will land) is most likely running the latest
userspace on it. Anyone who is running an old userspace is so far behind that
it's likely plagued with vulnerabilities, and an effort to enable cfi on it may
be futile.

So if x86 were to follow the arm model and use its legacy interworking bitmap to
support a mix of old (without `endbr64`) and new binaries in a task's address
space, I see the following problems:

- It'll need support in the kernel to prepare and manage a read-only bitmap in
  the task's address space, along with kernel code to update the bitmap when a
  `dlopen` happens in userspace.

- Every indirect call into a legacy binary will lead to an additional load from
  the legacy interworking bitmap in virtual memory and thus to perf issues. So
  even someone who has a mix of old and new binaries most likely won't enable
  it, or will try to make sure that all binaries in the task's address space
  have `endbr64` support.

- As of today no one is using the legacy interworking bitmap part of Intel CET,
  and I am also not sure how much of this hardware feature has been verified.
  So you are likely to run into issues and errata first.

This is a lot of throwaway work to support a use case which likely doesn't exist
because most distros compile binaries with `endbr64` anyway. Even if such a use
case exists, its perf characteristics will suck quite badly and its security
guarantees are also poor (worst of both worlds).

So my suggestion would be to keep it like shadow stack:

The loader decides whether or not to enable forward cfi for the current task
(as is done for shadow stack).
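
A rough sketch of that loader policy, assuming each loaded object advertises its support through the feature bits of its `GNU_PROPERTY_X86_FEATURE_1_AND` ELF note. The struct and function names are hypothetical, and the real loader logic is more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Feature bits carried in the GNU_PROPERTY_X86_FEATURE_1_AND note. */
#define FEATURE_1_IBT   (1u << 0)   /* object has endbr64 landing pads */
#define FEATURE_1_SHSTK (1u << 1)   /* object is shadow stack compatible */

/* Hypothetical per-object record kept by the loader. */
struct loaded_object {
    const char *name;
    uint32_t feature_1;             /* bits read from the object's note */
};

/*
 * Return the set of CFI features that can be enabled for the task: a feature
 * survives only if every loaded object advertises it. A single legacy object
 * without the IBT bit therefore turns forward cfi off for the whole task, and
 * no interworking bitmap is needed.
 */
uint32_t task_cfi_features(const struct loaded_object *objs, size_t n)
{
    uint32_t features = FEATURE_1_IBT | FEATURE_1_SHSTK;

    assert(objs != NULL || n == 0);
    if (n == 0)
        return 0;
    for (size_t i = 0; i < n; i++)
        features &= objs[i].feature_1;
    return features;
}
```

The loader would then issue whichever enable call the architecture provides for each bit that is still set.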

Hopefully Intel will deprecate legacy interworking part of Intel CET in future
generation(s) and simplify it.



Furthermore, control-flow integrity is both shadow stack (for backward cfi) and
branch tracking (for forward cfi). It'll look odd and ugly (to an extent
it already is because x86 and arm64/riscv use different prctls for shstk) if
shadow stack has its own prctl and we invent a new cfi prctl just for branch
tracking.

Ideally, it would have been nicer if we had
`PR_GET/SET_TASK_EXPLOIT_MITIGATIONS` and sub-codes under them to enable all
sorts of things like "Manage CFI", "Manage memory tagging", "Manage speculation
control", etc. But things evolved at their own pace on different timelines.

During shadow stack/lam enabling we tried to create a generic x86 interface for
"per-thread features" with the idea that IBT would also go in there. However,
tglx made a bunch of points[0] against trying to do a universal thing.

After that attempt we kind of gave up and just let them be specific. Although,
it was not appreciated at the time that arm and riscv shadow stack would be so
similar.

I don't think we should have a generic "CFI" control though, because there are
other very different forms of backward edge CFI like PAC.

[0] https://lore.kernel.org/lkml/87zgjjqico.ffs@tglx/


Given that we already have a fragmentation in prctl space, I propose we go for
arch-agnostic branch tracking prctl and let other ISAs implement support as
they go about it.

I think the situation for forward edge isn't the same as shadow stack, where the
features matched so well. At least for ARM. My best guess is that x86 could
possibly use it if/when we get to user IBT. But my best guess is that it would
involve a PROT too, so the semantics could differ. Sorry, I never looked at
your forward edge patches. I think you don't have a PROT, right?


If you agree, I'll let Paul choose the right name for it (given that indir_lp
isn't a favorite)