Re: [PATCH v2] MIPS: Check __clang__ to avoid performance influence with GCC in csum_tcpudp_nofold()

From: Tiezhu Yang
Date: Mon Mar 15 2021 - 08:27:25 EST


On 03/15/2021 04:49 AM, Maciej W. Rozycki wrote:
On Tue, 9 Mar 2021, Tiezhu Yang wrote:

diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index 1e6c135..80eddd4 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -128,9 +128,13 @@ static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
__u32 len, __u8 proto,
- __wsum sum)
+ __wsum sum_in)
{
- unsigned long tmp = (__force unsigned long)sum;
+#ifdef __clang__
+ unsigned long sum = (__force unsigned long)sum_in;
+#else
+ __wsum sum = sum_in;
+#endif
This looks much better to me, but I'd keep the variable names unchanged
as `sum_in' isn't used beyond the initial assignment anyway (you'll have
to update the references with asm operands accordingly of course).

Have you verified that code produced by GCC remains the same with your
change in place as it used to be up to commit 198688edbf77? I can see no
such information in the commit description whether here or in the said
commit.

Maciej

Hi Maciej,

Thanks for your reply.

gcc --version
gcc (Debian 10.2.1-6) 10.2.1 20210110

net/ipv4/tcp_ipv4.c
tcp_v4_send_reset()
csum_tcpudp_nofold()

objdump -d vmlinux > vmlinux.dump

(1) Before commit 198688edbf77
("MIPS: Fix inline asm input/output type mismatch in checksum.h used with Clang"):

ffffffff80aa835c: 00004025 move a4,zero
ffffffff80aa8360: 92020012 lbu v0,18(s0)
ffffffff80aa8364: de140030 ld s4,48(s0)
ffffffff80aa8368: 0064182d daddu v1,v1,a0
ffffffff80aa836c: 304200ff andi v0,v0,0xff
ffffffff80aa8370: 9c64000c lwu a0,12(v1)
ffffffff80aa8374: 9c660010 lwu a2,16(v1)
ffffffff80aa8378: afa70038 sw a3,56(sp)
ffffffff80aa837c: 24071a00 li a3,6656
ffffffff80aa8380: 0086202d daddu a0,a0,a2
ffffffff80aa8384: 0087202d daddu a0,a0,a3
ffffffff80aa8388: 0088202d daddu a0,a0,a4
ffffffff80aa838c: 0004083c dsll32 at,a0,0x0
ffffffff80aa8390: 0081202d daddu a0,a0,at
ffffffff80aa8394: 0081082b sltu at,a0,at
ffffffff80aa8398: 0004203f dsra32 a0,a0,0x0
ffffffff80aa839c: 00812021 addu a0,a0,at

(2) After commit 198688edbf77
("MIPS: Fix inline asm input/output type mismatch in checksum.h used with Clang"):

ffffffff80aa836c: 00004025 move a4,zero
ffffffff80aa8370: 92040012 lbu a0,18(s0)
ffffffff80aa8374: de140030 ld s4,48(s0)
ffffffff80aa8378: 0062182d daddu v1,v1,v0
ffffffff80aa837c: 308400ff andi a0,a0,0xff
ffffffff80aa8380: 9c62000c lwu v0,12(v1)
ffffffff80aa8384: 9c660010 lwu a2,16(v1)
ffffffff80aa8388: afa70038 sw a3,56(sp)
ffffffff80aa838c: 24071a00 li a3,6656
ffffffff80aa8390: 0046102d daddu v0,v0,a2
ffffffff80aa8394: 0047102d daddu v0,v0,a3
ffffffff80aa8398: 0048102d daddu v0,v0,a4
ffffffff80aa839c: 0002083c dsll32 at,v0,0x0
ffffffff80aa83a0: 0041102d daddu v0,v0,at
ffffffff80aa83a4: 0041082b sltu at,v0,at
ffffffff80aa83a8: 0002103f dsra32 v0,v0,0x0
ffffffff80aa83ac: 00411021 addu v0,v0,at

(3) With this patch:

ffffffff80aa835c: 00004025 move a4,zero
ffffffff80aa8360: 92020012 lbu v0,18(s0)
ffffffff80aa8364: de140030 ld s4,48(s0)
ffffffff80aa8368: 0064182d daddu v1,v1,a0
ffffffff80aa836c: 304200ff andi v0,v0,0xff
ffffffff80aa8370: 9c64000c lwu a0,12(v1)
ffffffff80aa8374: 9c660010 lwu a2,16(v1)
ffffffff80aa8378: afa70038 sw a3,56(sp)
ffffffff80aa837c: 24071a00 li a3,6656
ffffffff80aa8380: 0086202d daddu a0,a0,a2
ffffffff80aa8384: 0087202d daddu a0,a0,a3
ffffffff80aa8388: 0088202d daddu a0,a0,a4
ffffffff80aa838c: 0004083c dsll32 at,a0,0x0
ffffffff80aa8390: 0081202d daddu a0,a0,at
ffffffff80aa8394: 0081082b sltu at,a0,at
ffffffff80aa8398: 0004203f dsra32 a0,a0,0x0
ffffffff80aa839c: 00812021 addu a0,a0,at

(4) With the following changes based on commit 198688edbf77
("MIPS: Fix inline asm input/output type mismatch in checksum.h used with Clang"):

diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index 1e6c135..e1f80407 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -130,7 +130,11 @@ static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
__u32 len, __u8 proto,
__wsum sum)
{
+#ifdef __clang__
unsigned long tmp = (__force unsigned long)sum;
+#else
+ __wsum tmp = sum;
+#endif

__asm__(
" .set push # csum_tcpudp_nofold\n"

ffffffff80aa835c: 00004025 move a4,zero
ffffffff80aa8360: 92020012 lbu v0,18(s0)
ffffffff80aa8364: de140030 ld s4,48(s0)
ffffffff80aa8368: 0064182d daddu v1,v1,a0
ffffffff80aa836c: 304200ff andi v0,v0,0xff
ffffffff80aa8370: 9c64000c lwu a0,12(v1)
ffffffff80aa8374: 9c660010 lwu a2,16(v1)
ffffffff80aa8378: afa70038 sw a3,56(sp)
ffffffff80aa837c: 24071a00 li a3,6656
ffffffff80aa8380: 0086202d daddu a0,a0,a2
ffffffff80aa8384: 0087202d daddu a0,a0,a3
ffffffff80aa8388: 0088202d daddu a0,a0,a4
ffffffff80aa838c: 0004083c dsll32 at,a0,0x0
ffffffff80aa8390: 0081202d daddu a0,a0,at
ffffffff80aa8394: 0081082b sltu at,a0,at
ffffffff80aa8398: 0004203f dsra32 a0,a0,0x0
ffffffff80aa839c: 00812021 addu a0,a0,at

The code produced by GCC remains the same between (1), (3) and (4),
the last changes looks like better (with less changes based on commit
198688edbf77), so I will send v3 later.

Thanks,
Tiezhu