Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq

From: Mathieu Desnoyers
Date: Tue Jul 14 2020 - 09:19:54 EST


----- On Jul 14, 2020, at 9:00 AM, Florian Weimer fweimer@xxxxxxxxxx wrote:

> * Mathieu Desnoyers:
>
>>> How are extensions going to affect the definition of struct rseq,
>>> including its alignment?
>>
>> The alignment will never decrease. If the structure becomes large enough
>> its alignment could theoretically increase. Would that be an issue ?
>
> Telling the compiler that struct is larger than it actually is, or that
> it has more alignment than in memory, results in undefined behavior,
> even if only fields are accessed in the smaller struct region.
>
> An increase in alignment from 32 to 64 is perhaps not likely to have
> this effect. But the undefined behavior is still there, and has been
> observed for mismatches like 8 vs 16.

Good points.

>
>>> As things stand now, glibc 2.32 will make the size and alignment of
>>> struct rseq part of its ABI, so it can't really change after that.
>>
>> Can the size and alignment of a structure be defined as minimum alignment
>> and size values ? For instance, those would be invariant for a given glibc
>> version (if we always use the internal struct rseq declaration), but could
>> be increased in future versions.
>
> Not if we are talking about a global (TLS) data symbol. No such changes
> are possible there. We have some workarounds for symbols that live
> exclusively within glibc, but they don't work if there are libraries out
> there which interpose the symbol.

OK

>
>>> With a different approach, we can avoid making the symbol size part of
>>> the ABI, but then we cannot use the __rseq_abi TLS symbol. As a result,
>>> interoperability with early adopters would be lost.
>>
>> Do you mean with a function "getter", and then keeping that pointer around
>> in a per-user TLS ? I would prefer to avoid that because it adds an extra
>> pointer dereference on a fast path.
>
> My choice would have been a function that returns the offset from the
> thread pointer (which has to be unchanged regarding all threads).

So AFAIU we would have glibc expose a symbol, e.g.:

off_t rseq_tls_offset(void);

User libraries and applications would typically call it once at initialization
to get the offset of struct rseq from the thread pointer, and store the result
in a static variable so that rseq critical sections can use that offset.

Is there an arch-agnostic way to get the thread pointer from user-space code?
All rseq critical section implementations would need it.
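
To make the scheme concrete, here is a rough sketch of the caller side, not a
proposed implementation. Everything named here is hypothetical:
stub_rseq_tls_offset() stands in for the glibc getter discussed above (it
returns ptrdiff_t rather than off_t in this sketch), struct rseq_area_stub
stands in for the real struct rseq, and thread_pointer() uses per-architecture
inline asm with a fallback to the GCC builtin __builtin_thread_pointer(),
which is not available on every target or compiler version. It also assumes
the area lives in static TLS, so the offset is the same for every thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-thread rseq area owned by glibc. */
struct rseq_area_stub {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
};

/* Assumption: allocated in static TLS (main executable or initial-exec). */
static __thread struct rseq_area_stub rseq_stub;

/* Read the thread pointer. There is one variant per architecture here;
 * the fallback builtin exists only on some GCC/Clang targets. */
static inline void *thread_pointer(void)
{
#if defined(__x86_64__)
	void *tp;
	/* On x86-64 Linux, %fs:0 holds the thread pointer itself. */
	__asm__ ("movq %%fs:0, %0" : "=r" (tp));
	return tp;
#elif defined(__aarch64__)
	void *tp;
	__asm__ ("mrs %0, tpidr_el0" : "=r" (tp));
	return tp;
#else
	return __builtin_thread_pointer();
#endif
}

/* Hypothetical getter: offset of the rseq area from the thread pointer.
 * The offset may be negative (TLS sits below the thread pointer on x86-64). */
static ptrdiff_t stub_rseq_tls_offset(void)
{
	return (ptrdiff_t)((uintptr_t)&rseq_stub - (uintptr_t)thread_pointer());
}

/* Caller side: fetch the offset once, keep it in a static variable. */
static ptrdiff_t cached_rseq_offset;

static void rseq_user_init(void)
{
	cached_rseq_offset = stub_rseq_tls_offset();
}

/* Fast path: thread pointer plus cached offset, no extra pointer load. */
static inline struct rseq_area_stub *rseq_area_self(void)
{
	return (struct rseq_area_stub *)
		((uintptr_t)thread_pointer() + cached_rseq_offset);
}
```

Each thread would apply the same cached offset to its own thread pointer, so
the fast path avoids the extra pointer dereference of a per-user TLS pointer.
The per-architecture #if ladder in thread_pointer() is exactly the part an
arch-agnostic accessor would replace.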

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com