Re: [REGRESSION] rseq: refactoring in v6.19 broke everyone on arm64 and tcmalloc everywhere
From: Mark Rutland
Date: Wed Apr 22 2026 - 14:17:21 EST
On Wed, Apr 22, 2026 at 07:49:30PM +0200, Thomas Gleixner wrote:
> On Wed, Apr 22 2026 at 14:09, Mark Rutland wrote:
> > On Wed, Apr 22, 2026 at 11:50:26AM +0200, Mathias Stearn wrote:
> >> TL;DR: As of 6.19, rseq no longer provides the documented atomicity
> >> guarantees on arm64 by failing to abort the critical section on same-core
> >> preemption/resumption. Additionally, it breaks tcmalloc specifically by
> >> failing to overwrite the cpu_id_start field at points where it was relied
> >> on for correctness.
> >
> > Thanks for the report, and the test case.
> >
> > As a holding reply, I'm looking into this now from the arm64 side.
>
> I assume it's the partial conversion to the generic entry code which
> screws that up.
It's slightly more than that, but in a sense, yes. ;)
The fix is conceptually simple, but I'll need to do some refactoring.
Conceptually we just need to use syscall_enter_from_user_mode() and
irqentry_enter_from_user_mode() appropriately.
In practice, I can't use those as-is without introducing the exception
masking problems I just fixed up for irqentry_enter_from_kernel_mode(),
so I'll need to do some similar refactoring first.
That, and I *think* a couple of the current checks for CONFIG_GENERIC_ENTRY
should be checking CONFIG_GENERIC_IRQ_ENTRY, since all of the relevant
bits are in the generic irqentry code rather than the GENERIC_SYSCALL
code (and GENERIC_ENTRY is just GENERIC_IRQ_ENTRY + GENERIC_SYSCALL).
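i.e. in rough Kconfig terms (a sketch of the relationship as I described
it above, not the literal Kconfig text):

    config GENERIC_IRQ_ENTRY
            bool

    config GENERIC_SYSCALL
            bool

    config GENERIC_ENTRY
            bool
            select GENERIC_IRQ_ENTRY
            select GENERIC_SYSCALL

So an arch can have the irqentry bits (and hence a valid user_irq) well
before it converts its syscall path.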
> The problem reproduces with rseq selftests nicely.
Ah; that's both good to know, and worrying that we've never had a report
from all the automated testing people are supposedly running. :/
> The patch below fixes it as it puts ARM64 back to the non-optimized code
> for now. Once ARM64 is fully converted it gets all the nice improvements.
Thanks; I'll give that a test tomorrow.
I haven't paged everything in yet, so just to check: is there anything
that would behave incorrectly if current->rseq.event.user_irq were set
for syscall entry? IIUC that just means we'd effectively take the slow
path, and I was wondering whether that might be acceptable as a one-line
bodge for stable.
As above, I'd like it if the actual fix could make this work for
GENERIC_IRQ_ENTRY rather than GENERIC_ENTRY, since that way we can make
this work as it was supposed to *before* moving to GENERIC_SYSCALL
(which has a whole lot more ABI impact to worry about).
I think that just needs a small amount of refactoring that arm64 will
need regardless.
Mark.
>
> Thanks,
>
> tglx
> ---
> diff --git a/include/linux/rseq.h b/include/linux/rseq.h
> index 2266f4dc77b6..d55476e2a336 100644
> --- a/include/linux/rseq.h
> +++ b/include/linux/rseq.h
> @@ -30,7 +30,7 @@ void __rseq_signal_deliver(int sig, struct pt_regs *regs);
> */
> static inline void rseq_signal_deliver(struct ksignal *ksig, struct pt_regs *regs)
> {
> - if (IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY)) {
> + if (IS_ENABLED(CONFIG_GENERIC_ENTRY)) {
> /* '&' is intentional to spare one conditional branch */
> if (current->rseq.event.has_rseq & current->rseq.event.user_irq)
> __rseq_signal_deliver(ksig->sig, regs);
> @@ -50,7 +50,7 @@ static __always_inline void rseq_sched_switch_event(struct task_struct *t)
> {
> struct rseq_event *ev = &t->rseq.event;
>
> - if (IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY)) {
> + if (IS_ENABLED(CONFIG_GENERIC_ENTRY)) {
> /*
> * Avoid a boat load of conditionals by using simple logic
> * to determine whether NOTIFY_RESUME needs to be raised.
> diff --git a/include/linux/rseq_entry.h b/include/linux/rseq_entry.h
> index a36b472627de..8ccd464a108d 100644
> --- a/include/linux/rseq_entry.h
> +++ b/include/linux/rseq_entry.h
> @@ -80,7 +80,7 @@ bool rseq_debug_validate_ids(struct task_struct *t);
>
> static __always_inline void rseq_note_user_irq_entry(void)
> {
> - if (IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY))
> + if (IS_ENABLED(CONFIG_GENERIC_ENTRY))
> current->rseq.event.user_irq = true;
> }
>
> @@ -171,8 +171,8 @@ bool rseq_debug_update_user_cs(struct task_struct *t, struct pt_regs *regs,
> if (unlikely(usig != t->rseq.sig))
> goto die;
>
> - /* rseq_event.user_irq is only valid if CONFIG_GENERIC_IRQ_ENTRY=y */
> - if (IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY)) {
> + /* rseq_event.user_irq is only valid if CONFIG_GENERIC_ENTRY=y */
> + if (IS_ENABLED(CONFIG_GENERIC_ENTRY)) {
> /* If not in interrupt from user context, let it die */
> if (unlikely(!t->rseq.event.user_irq))
> goto die;
> @@ -387,7 +387,7 @@ static rseq_inline bool rseq_update_usr(struct task_struct *t, struct pt_regs *r
> * allows to skip the critical section when the entry was not from
> * a user space interrupt, unless debug mode is enabled.
> */
> - if (IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY)) {
> + if (IS_ENABLED(CONFIG_GENERIC_ENTRY)) {
> if (!static_branch_unlikely(&rseq_debug_enabled)) {
> if (likely(!t->rseq.event.user_irq))
> return true;