Re: [RFC PATCH v3 for 4.15 08/24] Provide cpu_opv system call

From: Mathieu Desnoyers
Date: Mon Nov 20 2017 - 17:45:12 EST

----- On Nov 20, 2017, at 1:49 PM, Andi Kleen andi@xxxxxxxxxxxxxx wrote:

>> Having cpu_opv do a 4k memcpy allow it to handle scenarios where
>> rseq fails to progress.
> If anybody ever gets that right. It will be really hard to just
> test such a path.
> It also seems fairly theoretical to me. Do you even have a
> test case where the normal path stops making forward progress?

We expect the following loop to progress, typically after a single
iteration:

	do {
		cpu = rseq_cpu_start();
		ret = rseq_addv(&v, 1, cpu);
	} while (ret);

Now run this under gdb: break on "main", run, and single-step
execution with "next". The program is stuck in an infinite loop.

What solution do you have in mind to handle this kind of
scenario without breaking pre-existing debuggers?

Looking at vDSO examples of vgetcpu and vclock_gettime under
gdb 7.7.1 (debian) with glibc 2.19:

sched_getcpu behavior under single-stepping per source line
with "step" seems to only see the ../sysdeps/unix/sysv/linux/x86_64/sched_getcpu.S
source lines, which makes it skip single-stepping of the vDSO.

sched_getcpu under "stepi": it does go through the vDSO instruction
addresses. It does progress, given that there is no loop there.

clock_gettime under "step": it only sees source lines of

clock_gettime under "stepi": it's stuck in an infinite loop.

So instruction-level stepping from gdb turns clock_gettime vDSO
into a never-ending loop, which is already bad. But with rseq,
the situation is even worse, because it turns source line level
single-stepping into infinite loops.

My understanding from
is that GDB currently simply removes the vDSO from its list of library
mappings, which is probably why it skips over vDSO for the source
lines single-stepping case. We cannot do that with rseq, because we
_want_ the rseq critical section to be inlined into the application
or library. A function call costs more than most rseq critical sections.

I plan to have the rseq user-space code provide a "__rseq_table" section
so debuggers can eventually figure out that they need to skip over the
rseq critical sections. However, it won't help the fact that pre-existing
debugger single-stepping will start turning perfectly working programs
into never-ending loops simply by having glibc use rseq for memory
allocation.

Using the cpu_opv system call on rseq failure solves this problem.
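
To make the fallback concrete, here is a rough user-space sketch of the
control flow. It is not taken from the patch set: every sim_* function is
a hypothetical stand-in (the real rseq_cpu_start(), rseq_addv() and
cpu_opv are not called), and the retry budget RSEQ_MAX_ATTEMPTS is an
arbitrary value chosen for illustration. The point is only that the fast
path is attempted a bounded number of times, after which the operation is
handed to a slow path that single-stepping cannot abort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RSEQ_MAX_ATTEMPTS	2	/* hypothetical retry budget */

/*
 * Simulation knob: when nonzero, every fast-path attempt aborts, as
 * happens when a debugger single-steps through the critical section.
 */
int sim_always_abort;
int sim_slow_path_calls;	/* counts slow-path invocations */

/* Stand-in for rseq_cpu_start(): the simulation has a single CPU. */
static int sim_rseq_cpu_start(void)
{
	return 0;
}

/* Stand-in for rseq_addv(): returns 0 on commit, nonzero on abort. */
static int sim_rseq_addv(intptr_t *v, intptr_t count, int cpu)
{
	(void)cpu;
	if (sim_always_abort)
		return -1;
	*v += count;
	return 0;
}

/* Stand-in for the cpu_opv slow path: always completes the addition. */
static int sim_cpu_opv_add(intptr_t *v, intptr_t count, int cpu)
{
	(void)cpu;
	sim_slow_path_calls++;
	*v += count;
	return 0;
}

/* Bounded fast-path attempts, then fall back to the slow path. */
int percpu_add(intptr_t *v, intptr_t count)
{
	int attempt, cpu = 0;

	assert(v != NULL);
	for (attempt = 0; attempt < RSEQ_MAX_ATTEMPTS; attempt++) {
		cpu = sim_rseq_cpu_start();
		if (!sim_rseq_addv(v, count, cpu))
			return 0;	/* fast path committed */
	}
	/* No progress on the fast path: use the slow path instead. */
	return sim_cpu_opv_add(v, count, cpu);
}
```

With sim_always_abort left at 0, percpu_add() commits on the first
attempt and never touches the slow path. With sim_always_abort set to 1,
which models the single-stepping case, it still terminates after
RSEQ_MAX_ATTEMPTS failed attempts by going through the slow path.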

I would even go further and recommend taking a similar approach when
lack of progress is detected in a vDSO: invoke the equivalent
system call. The current implementation of the clock_gettime()
vDSO turns instruction-level single-stepping into never-ending
loops, which is far from elegant.
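
The same idea can be sketched for a vDSO-style reader. This assumes the
loop in question is a sequence-counter retry loop, which is my reading of
the behavior rather than something shown above. Nothing here is real vDSO
code: struct sim_clock_data, the sim_* helpers and VDSO_MAX_RETRIES are
all hypothetical names made up for the illustration. The reader retries
while the sequence counter moves under it, gives up after a bounded
number of retries, and then asks the (simulated) system call instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VDSO_MAX_RETRIES	8	/* hypothetical retry budget */

/* Hypothetical shared clock data protected by a sequence counter. */
struct sim_clock_data {
	unsigned int seq;	/* odd while an update is in flight */
	int64_t ns;
};

/*
 * Simulation knob: when nonzero, a full update completes between any
 * two counter reads, as if every instruction were a debugger stop.
 */
int sim_stepping;
int sim_syscall_calls;		/* counts fallback invocations */

static unsigned int sim_read_seq(struct sim_clock_data *d)
{
	if (sim_stepping)
		d->seq += 2;	/* an update began and finished meanwhile */
	return d->seq;
}

/* Stand-in for the equivalent system call: always returns a value. */
static int64_t sim_clock_syscall(const struct sim_clock_data *d)
{
	sim_syscall_calls++;
	return d->ns;
}

/* Bounded sequence-counter read, then fall back to the system call. */
int64_t sim_clock_read(struct sim_clock_data *d)
{
	int retry;

	assert(d != NULL);
	for (retry = 0; retry < VDSO_MAX_RETRIES; retry++) {
		unsigned int begin = sim_read_seq(d);
		int64_t ns = d->ns;

		/* Accept only if no update overlapped the read. */
		if (!(begin & 1) && sim_read_seq(d) == begin)
			return ns;
	}
	/* Lack of progress detected: use the system call instead. */
	return sim_clock_syscall(d);
}
```

With sim_stepping at 0 the first read is consistent and is returned
directly. With sim_stepping at 1 every attempt sees the counter move, so
an unbounded loop would never exit, whereas the bounded version returns
through the fallback after VDSO_MAX_RETRIES attempts.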



Mathieu Desnoyers
EfficiOS Inc.