[tip: sched/core] rseq, ptrace: Add PTRACE_GET_RSEQ_CONFIGURATION request

From: tip-bot2 for Piotr Figiel
Date: Wed Mar 17 2021 - 12:16:37 EST

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 90f093fa8ea48e5d991332cee160b761423d55c1
Gitweb: https://git.kernel.org/tip/90f093fa8ea48e5d991332cee160b761423d55c1
Author: Piotr Figiel <figiel@xxxxxxxxxx>
AuthorDate: Fri, 26 Feb 2021 14:51:56 +01:00
Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
CommitterDate: Wed, 17 Mar 2021 16:15:39 +01:00

rseq, ptrace: Add PTRACE_GET_RSEQ_CONFIGURATION request

For userspace checkpoint and restore (C/R) a way of getting process state
containing RSEQ configuration is needed.

There are two ways this information is going to be used:
- to re-enable RSEQ for threads which had it enabled before C/R
- to detect if a thread was in a critical section during C/R

Since C/R preserves TLS memory and addresses RSEQ ABI will be restored
using the address registered before C/R.

Detection whether the thread is in a critical section during C/R is needed
to enforce behavior of RSEQ abort during C/R. Attaching with ptrace()
before registers are dumped itself doesn't cause RSEQ abort.
Restoring the instruction pointer within the critical section is
problematic because rseq_cs may get cleared before the control is passed
to the migrated application code leading to RSEQ invariants not being
preserved. C/R code will use RSEQ ABI address to find the abort handler
to which the instruction pointer needs to be set.

To achieve above goals expose the RSEQ ABI address and the signature value
with the new ptrace request PTRACE_GET_RSEQ_CONFIGURATION.

This new ptrace request can also be used by debuggers so they are aware
of stops within restartable sequences in progress.

Signed-off-by: Piotr Figiel <figiel@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Reviewed-by: Michal Miroslaw <emmir@xxxxxxxxxx>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
Acked-by: Oleg Nesterov <oleg@xxxxxxxxxx>
Link: https://lkml.kernel.org/r/20210226135156.1081606-1-figiel@xxxxxxxxxx
include/uapi/linux/ptrace.h | 10 ++++++++++
kernel/ptrace.c | 25 +++++++++++++++++++++++++
2 files changed, 35 insertions(+)

diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index 83ee45f..3747bf8 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -102,6 +102,16 @@ struct ptrace_syscall_info {

+struct ptrace_rseq_configuration {
+ __u64 rseq_abi_pointer;
+ __u32 rseq_abi_size;
+ __u32 signature;
+ __u32 flags;
+ __u32 pad;
* These values are stored in task->ptrace_message
* by tracehook_report_syscall_* to describe the current syscall-stop.
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 821cf17..c71270a 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -31,6 +31,7 @@
#include <linux/cn_proc.h>
#include <linux/compat.h>
#include <linux/sched/signal.h>
+#include <linux/minmax.h>

#include <asm/syscall.h> /* for syscall_get_* */

@@ -779,6 +780,24 @@ static int ptrace_peek_siginfo(struct task_struct *child,
return ret;

+static long ptrace_get_rseq_configuration(struct task_struct *task,
+ unsigned long size, void __user *data)
+ struct ptrace_rseq_configuration conf = {
+ .rseq_abi_pointer = (u64)(uintptr_t)task->rseq,
+ .rseq_abi_size = sizeof(*task->rseq),
+ .signature = task->rseq_sig,
+ .flags = 0,
+ };
+ size = min_t(unsigned long, size, sizeof(conf));
+ if (copy_to_user(data, &conf, size))
+ return -EFAULT;
+ return sizeof(conf);
#define is_singlestep(request) ((request) == PTRACE_SINGLESTEP)
@@ -1222,6 +1241,12 @@ int ptrace_request(struct task_struct *child, long request,
ret = seccomp_get_metadata(child, addr, datavp);

+ ret = ptrace_get_rseq_configuration(child, addr, datavp);
+ break;