Re: [musl] Re: [RFC PATCH] x86/vdso/32: Add AT_SYSINFO cancellation helpers
From: Andy Lutomirski
Date: Thu Mar 10 2016 - 20:15:09 EST
On Thu, Mar 10, 2016 at 4:48 PM, Rich Felker <dalias@xxxxxxxx> wrote:
> On Fri, Mar 11, 2016 at 01:18:54AM +0100, Szabolcs Nagy wrote:
>> * Rich Felker <dalias@xxxxxxxx> [2016-03-10 18:28:20 -0500]:
>> > On Thu, Mar 10, 2016 at 07:03:31PM +0100, Ingo Molnar wrote:
>> > >
>> > > * Rich Felker <dalias@xxxxxxxx> wrote:
>> > >
>> > > > > So instead of a sticky cancellation flag, we could introduce a sticky
>> > > > > cancellation signal.
>> > > > >
>> > > > > A 'sticky signal' is not cleared from signal_pending() when the signal handler
>> > > > > executes, but it's automatically blocked so no signal handler recursion
>> > > > > occurs. (A sticky signal could still be cleared via a separate mechanism, by
>> > > > > the cancellation cleanup code.)
>> > > > >
>> > > > > Such a 'sticky cancellation signal' would, in the racy situation, cause new
>> > > > > blocking system calls to immediately return with -EINTR. Non-blocking syscalls
>> > > > > could still be used. (So the cancellation signal handler itself would still
>> > > > > have access to various fundamental system calls.)
>> > > > >
>> > > > > I think this would avoid messy coupling between the kernel's increasingly more
>> > > > > varied system call entry code and C libraries.
>> > > > >
>> > > > > Sticky signals could be requested via a new SA_ flag.
>> > > > >
>> > > > > What do you think?
>> > > >
>> > > > This still doesn't address the issue that the code making the syscall needs to
>> > > > be able to control whether it's cancellable or not. Not only do some syscalls
>> > > > whose public functions are cancellation points need to be used internally in
>> > > > non-cancellable ways; there's also the pthread_setcancelstate interface that
>> > > > allows deferring cancellation so that it's possible to call functions which are
>> > > > cancellation points without invoking cancellation.
>> > >
>> > > I don't think there's a problem - but I might be wrong:
>> > >
>> > > One way I think it would work is the following: a sticky signal is not the
>> > > cancellation flag - it's a helper construct to implement the flag in user-space in
>> > > a race-free way.
>> > >
>> > > Say you have RT signal-32 as the cancellation signal, and it's a sticky signal.
>> > >
>> > > When pthread_cancel() wants to cancel another thread, it first (atomically) sets
>> > > the desired cancel state of the target thread. If that state signals that the
>> > > thread is cancellable right now, and that we initiated its cancellation, then we
>> > > send signal-32. I.e. the signal only ever gets sent if the thread is in a
>> > > cancellable state.
>> > >
>> > > libc internal functions and the pthread_setcancelstate() API can temporarily
>> > > change the cancel state of a thread to non-cancellable - but pthread_cancel()
>> > > observes those state transitions.
>> > >
>> > > The 'sticky' nature of signal-32 will make a difference in the following race
>> > > condition, if the cancellation flag is checked before a system call by the C
>> > > library, and signal-32 arrives before the system call is executed. In that case
>> > > the 'sticky' nature of the signal makes sure that all subsequent system calls
>> > > return immediately.
>> > >
>> > > The sticky signal is only ever sent when the thread is in cancellable state - and
>> > > if the target thread notices the cancellation request before the signal arrives,
>> > > it first waits for its arrival before executing any new system calls (as part of
>> > > the teardown, etc.).
>> > >
>> > > So the C library never has to do complex work with a sticky signal pending.
>> > >
>> > > Does that make more sense to you?
>> >
>> > No, it doesn't work. Cancellability of the target thread at the time
>> > of the cancellation request (when you would decide whether or not to
>> > send the signal) has no relation to cancellability at the time of
>> > calling the cancellation point. Consider 2 threads A and B and the
>> > following sequence of events:
>> >
>> > 1. A has cancellation enabled
>> > 2. B calls pthread_cancel(A) and sets sticky pending signal
>> > 3. A disables cancellation
>> > 4. A calls cancellation point and syscall wrongly gets interrupted
>> >
>> > This can be solved with more synchronization in pthread_cancel and
>> > pthread_setcancelstate, but it seems costly. pthread_setcancelstate
>> > would have to clear pending sticky cancellation signals, and any
>> > internal non-cancellable syscalls would have to be made using the same
>> > mechanism (effectively calling pthread_setcancelstate). A naive
>> > implementation of such clearing would involve a syscall itself,
>>
>> i think a syscall in setcancelstate in case of pending sticky signal
>> is not that bad given that cancellation is very rarely used.
>
> I agree, but it's not clear to me whether you could eliminate syscalls
> in the case where it's not pending, since AS-safe lock machinery is
> hard to get right. I don't see a way it can be done with just atomics
> because the syscall that sends the signal cannot be atomic with the
> memory operating setting a flag, which suggests a lock is needed, and
> then there are all sorts of issues to deal with.
>
>> however maintaining two completely different cancellation designs
>> is expensive and only the current one works on old kernels.
>
> Indeed. I think it would be hard to justify supporting a new one in
> musl unless there's some easy way to isolate the complexity of having
> both, being that vdso syscall is of marginal value to begin with
> anyway...
I would argue that vdso syscalls are of considerably more than
marginal utility. They are vastly faster. The difference isn't
subtle.
However... while it seems straightforward that a pthread cancellation
implementation should be correct, is there any reason that it needs to
fast? After all musl could always do:
if (this thread is cancellable right now)
use int $0x80 and eat the performance hit
else
call AT_SYSINFO
Aside from a branch, this adds minimal overhead to sane programs, and
programs crazy enough to try use pthread cancellation get penalized on
x86_32.
If I read it right, that's what musl already does. But I could be
reading it wrong:
if ((st=(self=__pthread_self())->canceldisable)
&& (st==PTHREAD_CANCEL_DISABLE || nr==SYS_close))
return __syscall(nr, u, v, w, x, y, z);
... slow path ...
Figuring out what "(st=(self=__pthread_self())->canceldisable) && ..."
does and why requires more cross-referencing that I care to do right
now. Shouldn't that "&&" at least be a "," or perhaps just be a
separate statement? On glibc, at least, PTHREAD_CANCEL_DISABLE == 1,
so this looks nearly tautological.
--Andy