Re: [RFC PATCH v4 1/5] glibc: Perform rseq(2) registration at nptl init and thread creation

From: Rich Felker
Date: Thu Nov 22 2018 - 10:45:55 EST


On Thu, Nov 22, 2018 at 10:33:19AM -0500, Mathieu Desnoyers wrote:
> ----- On Nov 22, 2018, at 10:21 AM, Florian Weimer fweimer@xxxxxxxxxx wrote:
>
> > * Rich Felker:
> >
> >> On Thu, Nov 22, 2018 at 04:11:45PM +0100, Florian Weimer wrote:
> >>> * Mathieu Desnoyers:
> >>>
> >>> > Thoughts ?
> >>> >
> >>> > /* Unregister rseq TLS from kernel. */
> >>> > if (has_rseq && __rseq_unregister_current_thread ())
> >>> >   abort();
> >>> >
> >>> > advise_stack_range (pd->stackblock, pd->stackblock_size, (uintptr_t) pd,
> >>> >                     pd->guardsize);
> >>> >
> >>> > /* If the thread is detached free the TCB. */
> >>> > if (IS_DETACHED (pd))
> >>> >   /* Free the TCB. */
> >>> >   __free_tcb (pd);
> >>>
> >>> Considering that we proceed to free the TCB, I really hope that all
> >>> signals are blocked at this point. (I have not checked this, though.)
> >>>
> >>> Wouldn't this address your concern about access to the rseq area?
> >>
> >> I'm not familiar with glibc's logic here, but for other reasons, I
> >> don't think freeing it is safe until the kernel task exit futex (set
> >> via clone or set_tid_address) has fired. I would guess __free_tcb just
> >> sets up for it to be reclaimable when this happens rather than
> >> immediately freeing it for reuse.
> >
> > Right, but in case of user-supplied stacks, we actually free TLS memory
> > at this point, so signals need to be blocked because the TCB is
> > (partially) gone after that.
>
> Unfortunately, disabling signals is not enough.
>
> With rseq registered, the kernel accesses the rseq TLS area when returning to
> user-space after _preemption_ of user-space, which can be triggered at any
> point by an interrupt or a fault, even if signals are blocked.
>
> So if there are cases where the TLS memory is freed while the thread is still
> running, we _need_ to explicitly unregister rseq beforehand.

OK, that makes sense. I was wrongly under the impression that the TLS
memory could not be reused until the task exit futex fired, but in
glibc that's not the case with caller-provided stacks.
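
To spell out the ordering constraint, here is a toy model in C. It is
not glibc code and makes no system calls: struct model_thread and the
model_* functions are names made up for illustration, and the
rseq_registered flag merely stands in for "the kernel may still write
to this thread's rseq area on return to user-space". A real
implementation would unregister through the rseq system call with
RSEQ_FLAG_UNREGISTER.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a thread descriptor (not glibc's struct pthread). */
struct model_thread {
  void *tls_block;       /* TLS memory that contains the rseq area */
  bool rseq_registered;  /* while true, the kernel may touch tls_block */
};

/* Models rseq unregistration.  Returns 0 on success, -1 if the thread
   was not registered in the first place. */
int model_rseq_unregister(struct model_thread *t)
{
  if (!t->rseq_registered)
    return -1;
  t->rseq_registered = false;
  return 0;
}

/* Releases the TLS block, but only once the kernel can no longer reach
   it.  Returns -1 and leaves the block untouched if rseq is still
   registered. */
int model_free_tls(struct model_thread *t)
{
  if (t->rseq_registered)
    return -1;
  free(t->tls_block);
  t->tls_block = NULL;
  return 0;
}

/* Exit path for the caller-provided-stack case: unregister first, then
   free.  Blocking signals would not replace the first step, because
   preemption can happen regardless of the signal mask. */
int model_thread_exit(struct model_thread *t)
{
  if (t->rseq_registered && model_rseq_unregister(t) != 0)
    return -1;
  return model_free_tls(t);
}
```

In this model, calling model_free_tls() on a thread that is still
registered is refused, while model_thread_exit() succeeds because it
unregisters before freeing.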

I still don't understand the need for a reference count though.

Rich