Re: [PATCH v4 1/5] getcpu_cache system call: cache CPU number of running thread

From: Peter Zijlstra
Date: Tue Mar 01 2016 - 16:32:19 EST


On Tue, Mar 01, 2016 at 08:23:12PM +0000, Mathieu Desnoyers wrote:
> I think it's important that user-space fast-paths can quickly
> detect whether the feature is enabled without having to rely on
> always reading a separate cache-line. I've put together an ABI
> proposal that takes into account the feedback received so far.

Nah, adding detection code to fast paths is silly, it makes them less
fast. Doesn't userspace have self-modifying code? I know that at least
glibc does linker trickery to call different functions depending on
runtime context.
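
To illustrate the kind of runtime dispatch meant here, below is a minimal
sketch using a plain function pointer that is resolved on first call. It is
only a portable stand-in for what a dynamic linker could do, not glibc's
actual mechanism. All names (fake_cpu_id, current_cpu, current_cpu_fast,
current_cpu_slow, current_cpu_resolve) are hypothetical, and fake_cpu_id
merely stands in for a kernel-updated field.

```c
#include <stdint.h>

/* Hypothetical stand-in for a kernel-updated per-thread field. */
int32_t fake_cpu_id = -1;

/* Fast path: just read the cached value, no feature check. */
int current_cpu_fast(void) { return fake_cpu_id; }

/* Slow path: placeholder for a fallback such as a system call. */
int current_cpu_slow(void) { return 0; }

int current_cpu_resolve(void);

/* Starts out pointing at the resolver; rebound on first call. */
int (*current_cpu_impl)(void) = current_cpu_resolve;

/* Pick an implementation once, then forward this first call to it. */
int current_cpu_resolve(void)
{
	current_cpu_impl = (fake_cpu_id >= 0) ? current_cpu_fast
					      : current_cpu_slow;
	return current_cpu_impl();
}

/* Callers only ever pay for one indirect call after resolution. */
int current_cpu(void) { return current_cpu_impl(); }
```

The choice is made once, so this sketch would not notice the feature being
enabled after the first call; a real scheme would have to rebind at
registration time.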

> struct thread_local_abi {
> 	/*
> 	 * Thread-local ABI cpu_id field.
> 	 * Updated by the kernel, and read by user-space with
> 	 * single-copy atomicity semantics. Aligned on 32-bit.
> 	 * Values:
> 	 * >= 0: CPU number of running thread.
> 	 * -1 (initial value): means the cpu_id feature is inactive.
> 	 * -2: cpu_id feature is not available.
> 	 */
> 	int32_t cpu_id;
>
> 	/*
> 	 * Thread-local ABI rseq_seqnum field.
> 	 * Updated by the kernel, and read by user-space with
> 	 * single-copy atomicity semantics. Aligned on 32-bit.
> 	 * Values:
> 	 * >= 0: current seqnum for this thread (feature is active).
> 	 * -1 (initial value): means the rseq feature is inactive.
> 	 * -2: rseq feature is not available.
> 	 */
> 	int32_t rseq_seqnum;

So I really hate that; it means we have to check for these special
values whenever we increment the seq count, and we cannot have it wrap
naturally.
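
A rough sketch of the difference, assuming the -1/-2 encoding quoted above.
The macro and function names are made up for illustration and are not taken
from the patch.

```c
#include <stdint.h>

/* Reserved values from the quoted proposal (names are hypothetical). */
#define SEQ_INACTIVE	(-1)
#define SEQ_UNAVAILABLE	(-2)

/*
 * Signed counter with reserved negative values: every increment has to
 * keep the result inside [0, INT32_MAX] and skip the special values.
 */
int32_t seq_next_reserved(int32_t seq)
{
	if (seq < 0)
		return seq;	/* inactive or unavailable: not a count */
	return (seq == INT32_MAX) ? 0 : seq + 1;
}

/* Plain unsigned counter: the increment wraps on its own. */
uint32_t seq_next_natural(uint32_t seq)
{
	return seq + 1;
}
```

The first variant needs a test and a branch on each update; the second is a
single add.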