Re: [PATCH v4 1/5] getcpu_cache system call: cache CPU number of running thread

From: Arnd Bergmann
Date: Mon Feb 29 2016 - 05:41:56 EST


On Monday 29 February 2016 11:32:21 Peter Zijlstra wrote:
> On Sun, Feb 28, 2016 at 12:39:54AM +0000, Mathieu Desnoyers wrote:
>
> > /* This structure needs to be aligned cache line size. */
> > struct thread_local_abi {
> > int32_t cpu_id;
> > uint32_t rseq_seqnum;
> > uint64_t rseq_post_commit_ip;
> > /* Add new fields at the end. */
> > } __attribute__((packed));
>
> I would really not use packed; that can lead to horrible layout.
>
> Suppose someone would add:
>
> uint32_t foo;
> uint64_t bar;
>
> With packed, you get an unaligned uint64_t in there, which is horrible.
> Without packed, you get a hole, which you can later fill.

What's making things worse is that on some architectures, adding
__packed will force access by bytes rather than just reading
a 32-bit or 64-bit number directly, so it's slow and non-atomic.
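
To make the layout difference concrete, here is a rough sketch of the
two variants with Peter's example fields appended. The struct names
tla_packed/tla_natural and the helper print_layout() are made up for
illustration and are not from the patch; the compile-time checks assume
a GCC-compatible compiler with C11 support.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: the quoted struct plus the example fields foo and bar. */
struct tla_packed {
	int32_t cpu_id;
	uint32_t rseq_seqnum;
	uint64_t rseq_post_commit_ip;
	uint32_t foo;
	uint64_t bar;
} __attribute__((packed));

/* Same fields, natural alignment. */
struct tla_natural {
	int32_t cpu_id;
	uint32_t rseq_seqnum;
	uint64_t rseq_post_commit_ip;
	uint32_t foo;
	uint64_t bar;
};

/* With packed there is no padding, so these hold on any target. */
_Static_assert(offsetof(struct tla_packed, bar) == 20,
	       "packed: bar starts at byte 20, not a multiple of 8");
_Static_assert(sizeof(struct tla_packed) == 28,
	       "packed: no tail padding either");
/* packed also drops the struct alignment to 1, so the compiler may not
 * assume that any member is naturally aligned. */
_Static_assert(_Alignof(struct tla_packed) == 1,
	       "packed: struct alignment is 1");

/* Print both layouts; the natural one depends on the target ABI. */
void print_layout(void)
{
	printf("packed:  bar at %zu, size %zu, align %zu\n",
	       offsetof(struct tla_packed, bar),
	       sizeof(struct tla_packed),
	       (size_t)_Alignof(struct tla_packed));
	printf("natural: bar at %zu, size %zu, align %zu\n",
	       offsetof(struct tla_natural, bar),
	       sizeof(struct tla_natural),
	       (size_t)_Alignof(struct tla_natural));
}
```

Calling print_layout() from a small main() on a typical 64-bit ABI
should show the natural variant with bar pushed out to an 8-byte
boundary and a 4-byte hole after foo, which is the hole that can be
filled later. The alignment of 1 on the packed variant is what lets the
compiler fall back to byte accesses on strict-alignment architectures.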

Arnd