Re: [RFC PATCH] x86/vdso/32: Add AT_SYSINFO cancellation helpers

From: Ingo Molnar
Date: Wed Mar 09 2016 - 03:56:47 EST

* Andy Lutomirski <luto@xxxxxxxxxx> wrote:

> musl implements system call cancellation in an unusual but clever way.

So I'm sceptical about the concept.

Could someone remind me why cancellation points matter to user-space?

I know the pthread APIs and semantics that are behind it, I just don't see how it
can be truly utilized for any meaningful programmatic property: for example the
moment you add any sort of ad-hoc printf() based tracing or any other spontaneous
logging IO to your application, you add in a lot of potential cancellation points
into various places in your user-space logic ...

It's _very_ easy to add inadvertent cancellation point to the code in practice, so
using the default pthread cancellation model and relying on what is a cancellation
point is crazy and very libc dependent in general. POSIX seems to be pretty vague
about it as well. So unless you make heavy use of pthread_setcancelstate() to
explicitly mark your work atoms, it's a really bad interface to rely on.

And if you are using pthread_setcancelstate(), instead of relying on calcellation,
then you are not really using the built-in cancellation points but have to spike
your code with pthread_testcancel(). In that case, why not just use your own
explicit 'cancellation' points in a few strategic places - which is mostly just a
simple flag really. That's what most worker thread models that I've seen use.

I suspect more complex runtimes like java runtimes couldn't care less, so it's
really something that only libc using C/C++ code cares about.

> When a thread issues a cancellable syscall, musl issues the syscall
> through a special thunk that looks roughly like this:
> cancellable_syscall:
> test whether a cancel is queued
> jnz cancel_me
> int $0x80
> end_cancellable_syscall:
> If a pthread cancellation signal hits with
> cancellable_syscall <= EIP < end_cancellable_syscall, then the
> signal interrupted a cancellation point before the syscall in
> question started. If so, it rewrites the calling context to skip
> the syscall and simulate a -EINTR return. The caller will detect
> this simulated -EINTR or an actual -EINTR and handle a possible
> cancellation event.

Why is so much complexity added to avoid a ~3 instructions window where
calcellation is tested? Cancellation at work atom boundaries is a fundamentally
'polling' model anyway, and signal delivery is asynchronous, with a fundamental
IPI delay if it's cross-CPU.

> This technique doesn't work if int $0x80 is replaced by a call to
> AT_SYSINFO: the signal handler can no longer tell whether it's
> interrupting a call to AT_SYSINFO or, if it is, where AT_SYSINFO was
> called from.
> Add minimal helpers so that musl's signal handler can learn the
> status of a possible pending AT_SYSINFO invocation and, if it hasn't
> entered the kernel yet, abort it without needing to parse the vdso
> DWARF unwind data.
> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
> ---
> musl people-
> Does this solve your AT_SYSINFO cancellation problem? I'd like to
> make sure it survives an actual implementation before I commit to the ABI.
> x86 people-
> Are you okay with this idea?
> arch/x86/entry/vdso/Makefile | 3 +-
> arch/x86/entry/vdso/vdso32/cancellation_helpers.c | 116 ++++++++++++++++++++++
> arch/x86/entry/vdso/vdso32/ | 2 +
> tools/testing/selftests/x86/unwind_vdso.c | 57 +++++++++--
> 4 files changed, 171 insertions(+), 7 deletions(-)
> create mode 100644 arch/x86/entry/vdso/vdso32/cancellation_helpers.c

I'd really like to see a cost/benefit analysis here! Some before/after explanation
- exactly what is not possible today (in practical terms), what are the practical
effects of not being able to do that, and how would the bright future look like?