Re: a question about IP checksum helper for arm64

From: Robin Murphy
Date: Mon Jul 09 2018 - 07:54:27 EST


Hi Bo,

On 06/07/18 17:27, Bo Yan wrote:
Hi Robin, Luke,

Recently I bumped into an error when running GCC undefined behavior sanitizer:

UBSAN: Undefined behaviour in kernel-4.9/arch/arm64/include/asm/checksum.h:34:6
ÂÂÂÂÂÂ load of misaligned address ffffffc198c8b254 for type 'const __int128 unsigned'
ÂÂÂÂÂÂ which requires 16 byte alignment

What's your config and reproducer here? I've had UBSan enabled a few times since that patch went in and never noticed anything. I've just tried it with 4.18-rc3 and indeed don't see anything from just booting the machine and making some network traffic. It does indeed fire if I also turn on CONFIG_UBSAN_ALIGNMENT, but then it's almost lost among a million other warnings for all manner of types - that's to be expected since, as the help text says, "Enabling this option on architectures that support unaligned accesses may produce a lot of false positives."


The relevant code:

ÂÂÂÂÂÂÂ tmp = *(const __uint128_t *)iph;
ÂÂÂÂÂÂÂ iph += 16;
ÂÂÂÂÂÂÂ ihl -= 4;
ÂÂÂÂÂÂÂ tmp += ((tmp >> 64) | (tmp << 64));
ÂÂÂÂÂÂÂ sum = tmp >> 64;
ÂÂÂÂÂÂÂ do {
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ sum += *(const u32 *)iph;
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ iph += 4;
ÂÂÂÂÂÂÂ } while (--ihl);

But, I checked the generated disassembly, it doesn't look like anything special is generated taking advantage of that.

I'm using Linaro GCC 6.4-2017.08, expecting ldp instructions to be emitted, but don't see it.

My regular toolchain is currently Linaro 7.2.1-2017.11, but I also tried the last GCC 6 I had installed (6.3.1-2017.05), and for both at -O2 I see LDP emitted as expected for most of the identifiable int128 accesses (both in a standalone test harness and a quick survey of kernel code via 'aarch64-linux-gnu-objdump -S net/ipv4/*.o'). Of course, there may well be places where the compiler gets clever enough to elide all or part of that load where data is already held in registers - I've not audited *that* closely - but the whole point of having a pure C implementation is that it can be aggressively inlined more than inline asm ever could.

There were some prior discussions about GCC behavior, like this thread: https://patchwork.kernel.org/patch/9081911/ , in which you talked about the difference between GCC4 and GCC5.3. It looks to me this is regressed in Linaro GCC6.4 build.

I have not checked newer GCC versions.

Will it be more stable to just do this with inline assembly instead of relying on __uint128_t data type?

GCC documentation says __int128 is supported for targets which have an integer mode wide enough to hold 128 bits. aarch64 doesn't have such an integer mode.

Yet AArch64 GCC definitely does support __uint128_t, or this code wouldn't even build ;)

Robin.


Thanks

Bo