Re: [PATCH] rseq: don't promote transient TLS faults to SIGSEGV
From: Thomas Gleixner
Date: Mon Jun 08 2026 - 05:23:03 EST

On Mon, Jun 08 2026 at 10:15, Yuanhe Shu wrote:
> On return to user space the rseq slow path writes the new cpu_id /
> mm_cid into the user-space rseq TLS. rseq_update_usr() already
> classifies its failures in rseq_event::fatal: the flag is set only
> when corrupt user data is positively identified (e.g. a bad rseq_cs
> signature or an out-of-bounds abort IP) and stays clear when the
> access merely hit an unresolved page fault.
>
> rseq_slowpath_update_usr() ignores that and calls force_sig(SIGSEGV)
> on any failure, so a transient page fault on a still-registered rseq
> area becomes a fatal SIGSEGV. This is reachable since glibc >= 2.35

It's not transient.

rseq_slowpath_update_usr() does the full pagefault resolution, which
means that if it returns without having resolved the fault, it's game
over. We also cannot return to user space in that case because the rseq
area, which is not accessible, has not been updated.
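
To illustrate the control flow described above, here is a toy user-space
model. It is not the kernel code: every toy_* name, the struct layout and
the resident/mappable flags are made up for this sketch. It only assumes
what is stated above, namely that the slow path retries with full fault
resolution and that a failure after that retry is final.

```c
#include <stdbool.h>

/* Hypothetical outcome of a write to the user-space rseq area. */
enum toy_access { TOY_ACCESS_OK, TOY_ACCESS_FAULT };

/* Toy stand-in for the rseq area; not a kernel structure. */
struct toy_rseq_area {
	bool resident;		/* page is currently present */
	bool mappable;		/* a fault on the page can be resolved */
	unsigned int cpu_id;	/* value the exit path wants to update */
};

/* Fast path model: no fault handling, a non-resident page just fails. */
static enum toy_access toy_write_nofault(struct toy_rseq_area *a,
					 unsigned int cpu)
{
	if (!a->resident)
		return TOY_ACCESS_FAULT;
	a->cpu_id = cpu;
	return TOY_ACCESS_OK;
}

/* Slow path model: the fault is fully resolved before the write. */
static enum toy_access toy_write_resolving(struct toy_rseq_area *a,
					   unsigned int cpu)
{
	if (!a->resident) {
		if (!a->mappable)
			return TOY_ACCESS_FAULT;	/* nothing left to retry */
		a->resident = true;			/* fault resolved */
	}
	a->cpu_id = cpu;
	return TOY_ACCESS_OK;
}

/*
 * Returns true if the task may return to user space with an updated
 * area, false if the update failed even with full fault resolution.
 */
static bool toy_exit_to_user(struct toy_rseq_area *a, unsigned int cpu)
{
	if (toy_write_nofault(a, cpu) == TOY_ACCESS_OK)
		return true;
	return toy_write_resolving(a, cpu) == TOY_ACCESS_OK;
}
```

In this model a page that is merely not resident is handled by the second
attempt, so the only way toy_exit_to_user() returns false is an area that
cannot be made accessible at all, and in that case cpu_id is left stale.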

Thanks,
tglx