On 23/03/2021 19.59, Oliver Hartkopp wrote:
On 23.03.21 15:00, Rasmus Villemoes wrote:
Now what CONFIG_* knobs are responsible for putting -mabi=apcs-gnu in
CFLAGS is left as an exercise for the reader. Regardless, it is not a
bug in the compiler. The error is the assumption that this language
"Aggregates and Unions
Structures and unions assume the alignment of their most strictly
aligned component.
(parse error in sentence)
It was a direct quote, but I can try to paraphrase with an example. If
you have a struct foo { T1 m1; T2 m2; T3 m3; }, then alignof(struct foo)
= max(alignof(T1), alignof(T2), alignof(T3)). Same for a "union foo".
But this is specifically for x86-64; for (some flavors of) ARM, other
rules apply - namely, alignof(T) is 4 unless T is char or short (or
(un)signed variants), ignoring bitfields which have their own rules.
Note that while
union u {char a; char b;}
has alignment 4 on ARM and 1 on x86-64, other types are less strictly
aligned on ARM; e.g. s64 aka long long is 8-byte aligned on x86-64 but
(still) just 4-byte aligned on ARM. And again, this is just for specific
-mabi= options.
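
A minimal C11 sketch for checking this on a given toolchain. The type names (struct foo with these member types, the MAX2 macro and print_alignments()) are made up for illustration and are not from the kernel. The printed values depend on the target and on the -mabi= in effect, so no expected output is given.

```c
#include <stdalign.h>
#include <stdio.h>

/* Hypothetical types, named only for this illustration. */
struct foo { char m1; short m2; long long m3; };
union u { char a; char b; };

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Largest alignment among the members of struct foo. */
enum {
    FOO_MAX_MEMBER_ALIGN =
        MAX2(alignof(char), MAX2(alignof(short), alignof(long long)))
};

/* Print what this particular compiler and ABI chose. */
void print_alignments(void)
{
    printf("alignof(struct foo) = %zu (max member alignment %d)\n",
           alignof(struct foo), (int)FOO_MAX_MEMBER_ALIGN);
    printf("alignof(union u)    = %zu, sizeof(union u) = %zu\n",
           alignof(union u), sizeof(union u));
    printf("alignof(long long)  = %zu\n", alignof(long long));
}
```

Under the x86-64 rule quoted above, the first line should show two equal numbers; with an ABI that imposes a minimum alignment on aggregates, the struct and union alignments may come out larger than any member's.
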
Each member is assigned to the lowest available offset with the
appropriate alignment. The size of any object is always a multiple of the
object's alignment."
from the x86-64 ABI applies on all other architectures/ABIs.
I'm not a compiler expert but this does not seem to be consistent.
Especially as we only have byte sizes (inside and outside of the union)
and "A field with a char type is aligned to the next available byte."
Yes, and that's exactly what you got before the anon union was
introduced.
Before(!) the union there is nothing to pad.
Just to be clear, my "before" was in the temporal sense, i.e. "prior to
commit ea7800565a128", all the u8s in struct can_frame were placed one
after the other. But after that commit, struct can_frame has a new
member replacing can_dlc which happens to occupy 4 bytes (for some
ABIs), pushing the subsequent members __pad, __res0 and len8_dlc
(formerly known as __res1) ahead.
The union is indeed aligned to the word boundary - but the following
byte is not aligned to the next available byte.
Yes it is, because the union occupies 4 bytes. The first byte is shared
by the two char members, the remaining three bytes are padding.
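
The effect can be reproduced on any GCC or Clang target with a simplified model. This is only a sketch: struct frame_before, struct frame_after and union len_or_dlc are hypothetical stand-ins, not the kernel definitions, and __attribute__((aligned(4))) is used here purely to emulate an ABI that gives every union 4-byte alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout prior to the commit: all u8 members back to back. */
struct frame_before {
    uint32_t can_id;
    uint8_t  can_dlc;
    uint8_t  pad;
    uint8_t  res0;
    uint8_t  res1;
    uint8_t  data[8];
};

/*
 * ASSUMPTION: aligned(4) stands in for an ABI that aligns every union
 * to 4 bytes; it is not what the kernel header contains.
 */
union len_or_dlc {
    uint8_t len;
    uint8_t can_dlc;
} __attribute__((aligned(4)));

/* Same members, but the first u8 is now wrapped in the union. */
struct frame_after {
    uint32_t can_id;
    union len_or_dlc u;
    uint8_t  pad;
    uint8_t  res0;
    uint8_t  len8_dlc;
    uint8_t  data[8];
};

/* One shared byte plus three bytes of trailing padding. */
_Static_assert(sizeof(union len_or_dlc) == 4, "union occupies 4 bytes");
/* The member following the union moves from offset 5 to offset 8. */
_Static_assert(offsetof(struct frame_before, pad) == 5, "old offset");
_Static_assert(offsetof(struct frame_after, pad) == 8, "new offset");
```

The static assertions document the point: the byte after the union is placed at the next available byte, which is offset 8 because the union itself ends there.
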
But why is the union 4 bytes long here, and why does it add three bytes
of padding at the end?
Essentially, because arrays. It's true for _any_ type T that sizeof(T)
must be a multiple of alignof(T). Take an array "T x[9]". If x[0] is
4-byte aligned, then in order for x[1] to be 4-byte aligned as well,
x[0] must occupy a multiple of 4 bytes.
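
The array argument can be sketched as follows. struct elem and all_elements_aligned() are hypothetical names chosen for this example; the struct has 5 bytes of payload but, on common ABIs, 4-byte alignment.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element type: 5 bytes of payload, typically 4-byte aligned. */
struct elem { uint32_t word; uint8_t byte; };

/*
 * Return 1 if every element of x[0..n) sits on an
 * alignof(struct elem) boundary, 0 otherwise.
 */
int all_elements_aligned(const struct elem *x, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if ((uintptr_t)&x[i] % alignof(struct elem) != 0)
            return 0;
    }
    return 1;
}
```

Since &x[i+1] is &x[i] plus sizeof(struct elem) bytes, the check can only succeed for every i if sizeof(struct elem) is a multiple of alignof(struct elem), which is why the compiler rounds the size up with trailing padding.
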
It doesn't matter at all that this happens to be an anonymous union.
Layout-wise, you could as well have a definition
    union uuu { __u8 len; __u8 can_dlc; };
and made struct can_frame
    struct can_frame {
        canid_t can_id;
        union uuu u;
        __u8 __pad;
        ...
    };
(you lose the anonymity trick so you'd have to do frame->u.can_dlc
instead of just frame->can_dlc). You have a member with alignof()==4 and
sizeof()==4; that sizeof() cannot magically become 1 just because that
particular instance of the type is not part of an array. Imagine what
would happen if the compiler pulled subsequent char members into
trailing padding of a previous compound member. E.g. consider
    struct a { int x; char y; }; // alignof==4, sizeof==8, offsetof(y)==4
    struct b { struct a a; char z; };
If I have a "struct b *b", I'm allowed to do "&b->a" and get a "pointer
to struct a". Then I can do memset(&b->a, 0, sizeof(struct a)). Clearly,
z must not have been placed inside the trailing padding of struct a.
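
The same argument as a compilable sketch. The helper clear_a_keep_z() is a hypothetical name introduced for illustration only.

```c
#include <stddef.h>
#include <string.h>

struct a { int x; char y; };
struct b { struct a a; char z; };

/* Zero the embedded struct a, padding included, and return b->z. */
char clear_a_keep_z(struct b *b)
{
    memset(&b->a, 0, sizeof(struct a));
    return b->z;
}

/* z starts only after all of struct a, trailing padding included. */
_Static_assert(offsetof(struct b, z) >= sizeof(struct a),
               "z is not stored in the trailing padding of a");
```

If a caller sets b.z = 'z' and then calls clear_a_keep_z(&b), the return value should still be 'z', precisely because z lives outside the sizeof(struct a) bytes that memset() overwrites.
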
Rasmus