Re: [patch V3 07/12] rseq: Implement syscall entry work for time slice extensions

From: Prakash Sangappa

Date: Tue Nov 18 2025 - 19:21:38 EST




> On Oct 29, 2025, at 6:22 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> The kernel sets SYSCALL_WORK_RSEQ_SLICE when it grants a time slice
> extension. This allows to handle the rseq_slice_yield() syscall, which is
> used by user space to relinquish the CPU after finishing the critical
> section for which it requested an extension.
>
> In case the kernel state is still GRANTED, the kernel resets both kernel
> and user space state with a set of sanity checks. If the kernel state is
> already cleared, then this raced against the timer or some other interrupt
> and just clears the work bit.
>
> Doing it in syscall entry work allows to catch misbehaving user space,
> which issues a syscall from the critical section. Wrong syscall and
> inconsistent user space result in a SIGSEGV.
>
>

[…]

> +/*
> + * Invoked from syscall entry if a time slice extension was granted and the
> + * kernel did not clear it before user space left the critical section.
> + */
> +void rseq_syscall_enter_work(long syscall)
> +{

[…]

>
> + curr->rseq.slice.state.granted = false;
> + /*
> + * Clear the grant in user space and check whether this was the
> + * correct syscall to yield. If the user access fails or the task
> + * used an arbitrary syscall, terminate it.
> + */
> + if (put_user(0U, &curr->rseq.usrptr->slice_ctrl.all) || syscall != __NR_rseq_slice_yield)
> + force_sig(SIGSEGV);
> +}

I have been trying to get our Database team to implement changes to use the slice extension API.
They encounter the issue with a system call being made within the slice extension window and the
process dies with SEGV.

Apparently it will be hard to enforce not calling a system call in the slice extension window due to layering.
For the DB use case, It is fine to terminate the slice extension if a system call is made, but the process
getting killed will not work.

Thanks,
-Prakash