Re: [PATCH v10] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6_magic tests

From: Guenter Roeck
Date: Fri Mar 01 2024 - 11:26:51 EST


On 2/29/24 22:46, Christophe Leroy wrote:


Le 26/02/2024 à 17:44, Guenter Roeck a écrit :
On 2/26/24 03:34, Christophe Leroy wrote:


Le 23/02/2024 à 23:11, Charlie Jenkins a écrit :
The test cases for ip_fast_csum and csum_ipv6_magic were not properly
aligning the IP header, which were causing failures on architectures
that do not support misaligned accesses like some ARM platforms. To
solve this, align the data along (14 + NET_IP_ALIGN) bytes which is the
standard alignment of an IP header and must be supported by the
architecture.

I'm still wondering what we are really trying to fix here.

All other tests are explicitely testing that it works with any alignment.

Shouldn't ip_fast_csum() and csum_ipv6_magic() work for any alignment as
well ? I would expect it, I see no comment in arm code which explicits
that assumption around those functions.

Isn't the problem only the following line, because csum_offset is
unaligned ?

csum = *(__wsum *)(random_buf + i + csum_offset);

Otherwise, if there really is an alignment issue for the IPv6 source or
destination address, isn't it enough to perform a 32 bits alignment ?


It isn't just arm.

Question should be what alignments the functions are supposed to be able
to handle, not what they are optimized for. If byte and/or half word
alignments
are expected to be supported, there is still architecture code which would
have to be fixed. Unaligned accesses are known to fail on hppa64/parisc64
and on sh4, for example. If unaligned accesses are expected to be handled,
it would probably make sense to add a separate test case, though, to
clarify
that the test fails due to alignment issues, not due to input parameters.


When you say "Unaligned accesses are known to fail on hppa64/parisc64
and on sh4", do you mean unaligned accesses in general or do you mean
ip_fast_csum() with unaligned ip header and csum_ipv6_magic() with
unaligned source and dest addresses ?

Because later in this thread it is said that only ARM and NIOS2
potentially have an issue.

And when you say "unaligned", to what level is that ? Is it 4-bytes
alignment or more or less ?


This e-mail chain is getting too long. Here is an attempt of a quick summary.

- Someone else suggested that unaligned accesses with nios2 should fail.
I since then tested and found that they pass at least for the checksum tests,
while dumping "unaligned access" messages into the kernel log. Other tests
(specifically gso) cause crashes, but that is unrelated.

- checksum tests on sh4 fail for unaligned data because of a bug introduced
to the architecture's checksum core with commit cadc4e1a2b4d ("sh: Handle
calling csum_partial with misaligned data"). The tests pass after reverting
that patch. I reported this, but that revert (or a fix of the problem it
introduced) has not been applied to linux-next.

- Checksum tests on unaligned data fail on parisc in mainline due to a number
of bugs in checksum assembler code (and with upstream qemu due to a bug in
qemu's hppa64 emulation). All those issues should by now be fixed in linux-next.
For reference, the following patches (SHAs from next-20240301) are needed to fix
the known problems:
0568b6f0d863 parisc: Strip upper 32 bit of sum in csum_ipv6_magic for 64-bit builds
4b75b12d7050 parisc: Fix csum_ipv6_magic on 64-bit systems
4408ba75e4ba parisc: Fix csum_ipv6_magic on 32-bit systems
a2abae8f0b63 parisc: Fix ip_fast_csum
qemu (v8.2 and later) needs
https://lore.kernel.org/all/20240217015811.1975411-1-linux@xxxxxxxxxxxx/T/
for the hppa64/parisc64 tests to work with qemu.

- Checksum tests on unaligned data cause a crash on arm systems with "thumb"
instruction set enabled (such as mps2_defconfig and an385). I didn't bother
checking if the crash is with 1-byte or 2-byte alignment.

- There used to be a crash with checksum tests on m68k because of word alignment
which the implementation of the unit tests for csum_ipv6_magic() did not take
into account (this is fixed by the padding in struct csum_ipv6_magic_data).
I don't know if this patch is needed to fix that problem or if it was since
fixed differently.

I hope that covers everything. As I said above, the chain is getting long
and I may have missed something.

I am currently re-testing on all platforms/architectures available in qemu
with the known bugs outside lib/checksum_kunit.c fixed and with the sh4 patch
reverted, but without this patch. I'll send an update in response to the v11
patch as soon as I have the results.

Guenter