RE: GCC, unaligned access and UB in the Linux kernel

From: David Laight
Date: Wed May 05 2021 - 06:41:54 EST

Next message: Nandor Han: "[PATCH v4 1/4] dt-bindings: nvmem: Add bootcount-nvmem"
Previous message: Jan Kara: "Re: [PATCH v3] ext4: Fix bug on in ext4_es_cache_extent as ext4_split_extent_at failed"
In reply to: Willy Tarreau: "Re: GCC, unaligned access and UB in the Linux kernel"
Next in thread: Vladislav K. Valtchev: "Re: GCC, unaligned access and UB in the Linux kernel"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

From: Vladislav K. Valtchev
> Sent: 04 May 2021 19:08
>
> Hi everyone,
>
> I noticed that in the Linux kernel we have a define called
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS that's used in a fair amount of places
> with the following purpose: when it's set, unaligned access is supported by the
> CPU so we can do it directly, otherwise fall-back to some logic where a byte at
> the time is read/written. For example, check the implementation of
> do_strncpy_from_user():
> https://elixir.bootlin.com/linux/latest/source/lib/strncpy_from_user.c#L15
>
>
> That's nice and seems to work today as expected, but there's one problem:
> unaligned access is UB according to the ISO C standard, ...

The C standard isn't relevant.

...
>
> Mitigations
> ------------
> In my understanding, the simplest way to force GCC to emit a single MOV
> instruction with unaligned access, without risking any kind of UB, is to use
> __builtin_memcpy(). It works great, but it requires fixing the code in many
> places. Also, maybe using get_unaligned()/put_unaligned() is the right thing to
> do? The problem is that the put_unaligned_* inline functions don't use
> __builtin_memcpy() and are defined like:
>
> static __always_inline void put_unaligned_le32(u32 val, void *p)
> {
> *((__le32 *)p) = cpu_to_le32(val);
> }
>
> So, still UB. To make the compiler happy, maybe we should make them use
> __builtin_memcpy()?

That doesn't help you at all.
If the compiler is given any hint that the address is aligned
it will use potentially misaligned memory accesses.
So if the above was:
*(int __attribute__((packed)) *)p = val;
and the caller had:
int *p = maybe_unaligned();
put_unaligned_le32(0, p);

Then gcc will generate a 32bit write even on sparc.
__builtin_memcpy() will expand to exactly the same 32bit write.

This is nothing new - at least 20 years.

Basically, the C standard only allows you to cast a pointer
to 'void *' and then back to its original type.
GCC makes use of this for various optimisations and will
track the alignment through (void *) casts.

I believe that, even for the simd instructions, gcc will pick
the 'misalign supporting' version if the data is marked
with the relevant alignment.

Oh - and the loop vectorisation code is a pile of crap.
You almost never want it - look at the code it generates
for a loop you go round a few times.
It you are doing 10000 iteractions you'll need to unroll
it yourself to get decent performance.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Next message: Nandor Han: "[PATCH v4 1/4] dt-bindings: nvmem: Add bootcount-nvmem"
Previous message: Jan Kara: "Re: [PATCH v3] ext4: Fix bug on in ext4_es_cache_extent as ext4_split_extent_at failed"
In reply to: Willy Tarreau: "Re: GCC, unaligned access and UB in the Linux kernel"
Next in thread: Vladislav K. Valtchev: "Re: GCC, unaligned access and UB in the Linux kernel"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]