Re: [PATCH] riscv: signal: handle syscall restart before get_signal

From: Guo Ren
Date: Thu Aug 03 2023 - 21:09:45 EST


On Fri, Aug 4, 2023 at 6:45 AM Haorong Lu <ancientmodern4@xxxxxxxxx> wrote:
>
> In the current riscv implementation, blocking syscalls like read() may
> not correctly restart after being interrupted by ptrace. This problem
> arises when the syscall restart process in arch_do_signal_or_restart()
> is bypassed due to changes to the regs->cause register, such as an
> ebreak instruction.
>
> Steps to reproduce:
> 1. Interrupt the tracee process with PTRACE_SEIZE & PTRACE_INTERRUPT.
> 2. Backup original registers and instruction at new_pc.
> 3. Change pc to new_pc, and inject an instruction (like ebreak) to this
> address.
> 4. Resume with PTRACE_CONT and wait for the process to stop again after
> executing ebreak.
> 5. Restore original registers and instructions, and detach from the
> tracee process.
> 6. Now the read() syscall in tracee will return -1 with errno set to
> ERESTARTSYS.
>
> Specifically, during an interrupt, the regs->cause changes from
> EXC_SYSCALL to EXC_BREAKPOINT due to the injected ebreak, which is
> inaccessible via ptrace so we cannot restore it. This alteration breaks
> the syscall restart condition and ends the read() syscall with an
> ERESTARTSYS error. According to include/linux/errno.h, it should never
> be seen by user programs. X86 can avoid this issue as it checks the
> syscall condition using a register (orig_ax) exposed to user space.
> Arm64 handles syscall restart before calling get_signal, where it could
> be paused and inspected by ptrace/debugger.
>
> This patch adjusts the riscv implementation to arm64 style, which also
> checks syscall using a kernel register (syscallno). It ensures the
> syscall restart process is not bypassed when changes to the cause
> register occur, providing more consistent behavior across various
> architectures.
>
> For a simplified reproduction program, feel free to visit:
> https://github.com/ancientmodern/riscv-ptrace-bug-demo.
>
> Signed-off-by: Haorong Lu <ancientmodern4@xxxxxxxxx>
> ---
> arch/riscv/kernel/signal.c | 85 +++++++++++++++++++++-----------------
> 1 file changed, 46 insertions(+), 39 deletions(-)
>
> diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
> index 180d951d3624..d2d7169048ea 100644
> --- a/arch/riscv/kernel/signal.c
> +++ b/arch/riscv/kernel/signal.c
> @@ -391,30 +391,6 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
> sigset_t *oldset = sigmask_to_save();
> int ret;
>
> - /* Are we from a system call? */
> - if (regs->cause == EXC_SYSCALL) {
> - /* Avoid additional syscall restarting via ret_from_exception */
> - regs->cause = -1UL;
> - /* If so, check system call restarting.. */
> - switch (regs->a0) {
> - case -ERESTART_RESTARTBLOCK:
> - case -ERESTARTNOHAND:
> - regs->a0 = -EINTR;
> - break;
> -
> - case -ERESTARTSYS:
> - if (!(ksig->ka.sa.sa_flags & SA_RESTART)) {
> - regs->a0 = -EINTR;
> - break;
> - }
> - fallthrough;
> - case -ERESTARTNOINTR:
> - regs->a0 = regs->orig_a0;
> - regs->epc -= 0x4;
> - break;
> - }
> - }
> -
> rseq_signal_deliver(ksig, regs);
>
> /* Set up the stack frame */
> @@ -428,35 +404,66 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
>
> void arch_do_signal_or_restart(struct pt_regs *regs)
> {
> + unsigned long continue_addr = 0, restart_addr = 0;
> + int retval = 0;
> struct ksignal ksig;
> + bool syscall = (regs->cause == EXC_SYSCALL);
>
> - if (get_signal(&ksig)) {
> - /* Actually deliver the signal */
> - handle_signal(&ksig, regs);
> - return;
> - }
> + /* If we were from a system call, check for system call restarting */
> + if (syscall) {
> + continue_addr = regs->epc;
> + restart_addr = continue_addr - 4;
> + retval = regs->a0;
>
> - /* Did we come from a system call? */
> - if (regs->cause == EXC_SYSCALL) {
> /* Avoid additional syscall restarting via ret_from_exception */
> regs->cause = -1UL;
>
> - /* Restart the system call - no handlers present */
> - switch (regs->a0) {
> + /*
> + * Prepare for system call restart. We do this here so that a
> + * debugger will see the already changed PC.
> + */
> + switch (retval) {
> case -ERESTARTNOHAND:
> case -ERESTARTSYS:
> case -ERESTARTNOINTR:
> - regs->a0 = regs->orig_a0;
> - regs->epc -= 0x4;
> - break;
> case -ERESTART_RESTARTBLOCK:
> - regs->a0 = regs->orig_a0;
> - regs->a7 = __NR_restart_syscall;
> - regs->epc -= 0x4;
> + regs->a0 = regs->orig_a0;
> + regs->epc = restart_addr;
> break;
> }
> }
>
> + /*
> + * Get the signal to deliver. When running under ptrace, at this point
> + * the debugger may change all of our registers.
> + */
> + if (get_signal(&ksig)) {
> + /*
> + * Depending on the signal settings, we may need to revert the
> + * decision to restart the system call, but skip this if a
> + * debugger has chosen to restart at a different PC.
> + */
> + if (regs->epc == restart_addr &&
> + (retval == -ERESTARTNOHAND ||
> + retval == -ERESTART_RESTARTBLOCK ||
> + (retval == -ERESTARTSYS &&
> + !(ksig.ka.sa.sa_flags & SA_RESTART)))) {
> + regs->a0 = -EINTR;
> + regs->epc = continue_addr;
> + }
> +
> + /* Actually deliver the signal */
> + handle_signal(&ksig, regs);
> + return;
> + }
> +
> + /*
> + * Handle restarting a different system call. As above, if a debugger
> + * has chosen to restart at a different PC, ignore the restart.
> + */
> + if (syscall && regs->epc == restart_addr && retval == -ERESTART_RESTARTBLOCK)
> + regs->a7 = __NR_restart_syscall;
> +
I think your patch contains two parts:
1. The bugfix.
2. Coding-convention changes and adjustments to some of the original
signal logic.

Could we separate them into two patches and keep the bugfix one
minimal? That would make your patches easier to review.
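
To make the bugfix part easier to isolate, the ordering in the quoted hunk can be modelled in user space. This is only a sketch under assumptions, not kernel code: struct model_regs, struct restart_state, prepare_restart() and revert_restart_for_handler() are hypothetical names, and the ERESTART* values are copied from include/linux/errno.h. It shows the two steps the hunk describes: rewind to the ecall before get_signal(), then undo that choice if a handler without SA_RESTART is about to run and the debugger has not moved the PC.

```c
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal restart codes; values as in include/linux/errno.h. */
#ifndef ERESTARTSYS
#define ERESTARTSYS           512
#endif
#ifndef ERESTARTNOINTR
#define ERESTARTNOINTR        513
#endif
#ifndef ERESTARTNOHAND
#define ERESTARTNOHAND        514
#endif
#ifndef ERESTART_RESTARTBLOCK
#define ERESTART_RESTARTBLOCK 516
#endif

/* Hypothetical stand-in for the few pt_regs fields the hunk touches. */
struct model_regs {
	long a0;            /* syscall return value / first argument */
	long orig_a0;       /* first argument saved at syscall entry */
	unsigned long epc;  /* address execution resumes at */
};

/* Hypothetical container for the locals the hunk keeps across get_signal(). */
struct restart_state {
	bool syscall;
	long retval;
	unsigned long continue_addr;
	unsigned long restart_addr;
};

/* Step 1 (before get_signal()): tentatively rewind to the 4-byte ecall. */
struct restart_state prepare_restart(struct model_regs *regs, bool from_syscall)
{
	struct restart_state st = { .syscall = from_syscall };

	if (!from_syscall)
		return st;

	st.continue_addr = regs->epc;
	st.restart_addr = regs->epc - 4;
	st.retval = regs->a0;

	switch (st.retval) {
	case -ERESTARTNOHAND:
	case -ERESTARTSYS:
	case -ERESTARTNOINTR:
	case -ERESTART_RESTARTBLOCK:
		regs->a0 = regs->orig_a0;
		regs->epc = st.restart_addr;
		break;
	}
	return st;
}

/*
 * Step 2 (a handler will run): fall back to -EINTR unless the restart
 * is still wanted, and only if the PC is still where step 1 left it.
 */
void revert_restart_for_handler(struct model_regs *regs,
				const struct restart_state *st,
				bool sa_restart)
{
	if (!st->syscall || regs->epc != st->restart_addr)
		return;

	if (st->retval == -ERESTARTNOHAND ||
	    st->retval == -ERESTART_RESTARTBLOCK ||
	    (st->retval == -ERESTARTSYS && !sa_restart)) {
		regs->a0 = -EINTR;
		regs->epc = st->continue_addr;
	}
}
```

In this model, once step 1 has run, nothing in step 2 reads the cause value again, which is the property the commit message relies on. Whether a minimal bugfix patch could keep the rest of the existing code unchanged is a separate question that the sketch does not answer.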

> /*
> * If there is no signal to deliver, we just put the saved
> * sigmask back.
> --
> 2.41.0
>


--
Best Regards
Guo Ren