Re: [RFC PATCH] x86/vdso/32: Add AT_SYSINFO cancellation helpers

From: Ingo Molnar
Date: Wed Mar 09 2016 - 03:56:47 EST

Next message: Jani Nikula: "Re: Kernel docs: muddying the waters a bit"
Previous message: Alessio Igor Bogani: "Re: [1/1] powerpc/embedded6xx: Make reboot works on MVME5100"
In reply to: Andy Lutomirski: "[RFC PATCH] x86/vdso/32: Add AT_SYSINFO cancellation helpers"
Next in thread: Szabolcs Nagy: "Re: [musl] Re: [RFC PATCH] x86/vdso/32: Add AT_SYSINFO cancellation helpers"

* Andy Lutomirski <luto@xxxxxxxxxx> wrote:

> musl implements system call cancellation in an unusual but clever way.

So I'm sceptical about the concept.

Could someone remind me why cancellation points matter to user-space?

I know the pthread APIs and semantics that are behind it, I just don't see how it
can be truly utilized for any meaningful programmatic property: for example the
moment you add any sort of ad-hoc printf() based tracing or any other spontaneous
logging IO to your application, you add in a lot of potential cancellation points
into various places in your user-space logic ...

It's _very_ easy to add inadvertent cancellation points to the code in practice, so
using the default pthread cancellation model and relying on what is a cancellation
point is crazy and very libc dependent in general. POSIX seems to be pretty vague
about it as well. So unless you make heavy use of pthread_setcancelstate() to
explicitly mark your work atoms, it's a really bad interface to rely on.
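
To illustrate what "marking a work atom" means here, a minimal sketch: the
function name process_item() and its trivial workload are hypothetical, only
pthread_setcancelstate() and printf() are real APIs. Cancellation is disabled
around the atom, so the logging printf() inside it cannot act as a
cancellation point:

```c
#include <pthread.h>
#include <stdio.h>

/*
 * Hypothetical work atom that must not be cancelled halfway through.
 * The printf() stands in for ad-hoc logging, which may be a
 * cancellation point under the default cancellation model.
 */
static int process_item(int item)
{
	int oldstate, ignored;
	int result;

	/* Entering the work atom: a pending cancel stays pending. */
	pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);

	result = item * 2;
	printf("processed item %d\n", item);

	/* Leaving the work atom: restore whatever state the caller had. */
	pthread_setcancelstate(oldstate, &ignored);

	return result;
}
```

A cancel request that arrives during the atom is only acted upon at the next
cancellation point reached after the old state has been restored.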

And if you are using pthread_setcancelstate(), instead of relying on cancellation,
then you are not really using the built-in cancellation points but have to spike
your code with pthread_testcancel(). In that case, why not just use your own
explicit 'cancellation' points in a few strategic places - which is mostly just a
simple flag really. That's what most worker thread models that I've seen use.
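
A minimal sketch of that flag-based model, assuming C11 atomics; the names
stop_requested, request_stop() and worker_loop() are made up for illustration
and not taken from any particular project:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared flag, set by whoever wants the worker to stop. */
static atomic_bool stop_requested;

static void request_stop(void)
{
	atomic_store(&stop_requested, true);
}

/*
 * Loop body a worker thread would run: handles up to max_items work
 * atoms and polls the flag only at atom boundaries. Returns the number
 * of atoms that were completed.
 */
static int worker_loop(int max_items)
{
	int done = 0;

	while (done < max_items) {
		if (atomic_load(&stop_requested))
			break;	/* the explicit 'cancellation' point */

		/* ... one work atom would go here ... */
		done++;
	}

	return done;
}
```

The worker only ever stops between atoms, no matter how much logging or other
IO is added inside an atom later on.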

I suspect more complex runtimes like Java runtimes couldn't care less, so it's
really something that only libc-using C/C++ code cares about.

> When a thread issues a cancellable syscall, musl issues the syscall
> through a special thunk that looks roughly like this:
>
> cancellable_syscall:
>	test whether a cancel is queued
>	jnz cancel_me
>	int $0x80
> end_cancellable_syscall:
>
> If a pthread cancellation signal hits with
> cancellable_syscall <= EIP < end_cancellable_syscall, then the
> signal interrupted a cancellation point before the syscall in
> question started. If so, it rewrites the calling context to skip
> the syscall and simulate a -EINTR return. The caller will detect
> this simulated -EINTR or an actual -EINTR and handle a possible
> cancellation event.

Why is so much complexity added to avoid a ~3 instructions window where
cancellation is tested? Cancellation at work atom boundaries is a fundamentally
'polling' model anyway, and signal delivery is asynchronous, with a fundamental
IPI delay if it's cross-CPU.
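
For reference, the window test described in the quoted text boils down to a
half-open range check on the interrupted instruction pointer. A sketch with a
hypothetical helper name; how the handler obtains the saved EIP and the two
label addresses is left out, this is not musl's actual code:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the interrupted instruction pointer lies
 * in [start, end), i.e. at or after cancellable_syscall but before
 * end_cancellable_syscall, so the int $0x80 has not completed yet.
 */
static bool in_cancellable_window(uintptr_t ip, uintptr_t start,
				  uintptr_t end)
{
	return ip >= start && ip < end;
}
```

An ip equal to the end label is outside the window: at that point the syscall
has already returned and its result must not be discarded.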

> This technique doesn't work if int $0x80 is replaced by a call to
> AT_SYSINFO: the signal handler can no longer tell whether it's
> interrupting a call to AT_SYSINFO or, if it is, where AT_SYSINFO was
> called from.
>
> Add minimal helpers so that musl's signal handler can learn the
> status of a possible pending AT_SYSINFO invocation and, if it hasn't
> entered the kernel yet, abort it without needing to parse the vdso
> DWARF unwind data.
>
> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
> ---
>
> musl people-
>
> Does this solve your AT_SYSINFO cancellation problem? I'd like to
> make sure it survives an actual implementation before I commit to the ABI.
>
> x86 people-
>
> Are you okay with this idea?
>
>
> arch/x86/entry/vdso/Makefile | 3 +-
> arch/x86/entry/vdso/vdso32/cancellation_helpers.c | 116 ++++++++++++++++++++++
> arch/x86/entry/vdso/vdso32/vdso32.lds.S | 2 +
> tools/testing/selftests/x86/unwind_vdso.c | 57 +++++++++--
> 4 files changed, 171 insertions(+), 7 deletions(-)
> create mode 100644 arch/x86/entry/vdso/vdso32/cancellation_helpers.c

I'd really like to see a cost/benefit analysis here! Some before/after explanation
- exactly what is not possible today (in practical terms), what are the practical
effects of not being able to do that, and what would the bright future look like?

Thanks,

Ingo
