Re: [RFC] csum experts, csum_replace2() is too expensive

From: Eric Dumazet
Date: Sun Mar 23 2014 - 22:50:23 EST


On Fri, 2014-03-21 at 14:52 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@xxxxxxxxx>
> Date: Fri, 21 Mar 2014 05:50:50 -0700
>
> > It looks like a barrier() would be more appropriate.
>
> barrier() == __asm__ __volatile__(:::"memory")

Indeed, but now that you mention it, ip_fast_csum() does not use the
volatile keyword on x86_64, and has no "m" constraint either.

This means that for the following hypothetical networking code:

void foobar(struct iphdr *iph, __be16 newlen, __be16 newid)
{
	iph->tot_len = newlen;
	iph->check = 0;
	iph->check = ip_fast_csum((u8 *)iph, 5);

	pr_err("%p\n", iph);

	iph->id = newid;
	iph->check = 0;
	iph->check = ip_fast_csum((u8 *)iph, 5);
}


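ip_fast_csum() is done _once_ only: the asm is not volatile and declares
no memory input, so gcc sees two identical asm statements with the same
inputs (iph and 5) and is free to reuse the first result.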
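
To make the expected behaviour concrete, here is a rough user-space sketch
of foobar() in portable C. It is only a model, not the kernel code:
struct demo_iphdr, demo_ip_csum(), demo_foobar() and the first_check
parameter are hypothetical names made up for illustration, byte order is
ignored, and barrier() stands in for the pr_err() call. Because the
checksum here is computed with plain C loads, the compiler knows it reads
the header, so both computations are performed and the two results differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Compiler-only barrier, spelled as in the kernel (GCC/Clang extension). */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for struct iphdr: 20 bytes, no IP options. */
struct demo_iphdr {
	uint8_t  ver_ihl;
	uint8_t  tos;
	uint16_t tot_len;
	uint16_t id;
	uint16_t frag_off;
	uint8_t  ttl;
	uint8_t  protocol;
	uint16_t check;
	uint32_t saddr;
	uint32_t daddr;
};

/*
 * Portable model of ip_fast_csum(): one's-complement sum over ihl 32-bit
 * words, folded to 16 bits and inverted. These are plain C loads, so the
 * compiler knows the result depends on the header contents.
 */
static uint16_t demo_ip_csum(const void *hdr, unsigned int ihl)
{
	const unsigned char *p = hdr;
	uint32_t sum = 0;
	unsigned int i;

	assert(ihl >= 5);
	for (i = 0; i < ihl * 2; i++) {
		uint16_t w;

		memcpy(&w, p + 2 * i, sizeof(w));
		sum += w;
	}
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Same shape as foobar(); first_check only exists to observe the 1st result. */
static void demo_foobar(struct demo_iphdr *iph, uint16_t newlen,
			uint16_t newid, uint16_t *first_check)
{
	iph->tot_len = newlen;
	iph->check = 0;
	iph->check = demo_ip_csum(iph, 5);
	*first_check = iph->check;

	barrier();	/* stands in for pr_err() */

	iph->id = newid;
	iph->check = 0;
	iph->check = demo_ip_csum(iph, 5);	/* must be computed again */
}
```

After demo_foobar() returns, running demo_ip_csum() over the finished
header should give 0 (a valid header sums to zero), and the final check
value should differ from *first_check whenever newid changes iph->id.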

The following patch seems needed. That's yet another call for x86 code factorization ...

diff --git a/arch/x86/include/asm/checksum_64.h b/arch/x86/include/asm/checksum_64.h
index e6fd8a026c7b..c67778544880 100644
--- a/arch/x86/include/asm/checksum_64.h
+++ b/arch/x86/include/asm/checksum_64.h
@@ -46,7 +46,7 @@ static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
 {
 	unsigned int sum;
 
-	asm("	movl (%1), %0\n"
+	asm volatile("	movl (%1), %0\n"
 	    "	subl $4, %2\n"
 	    "	jbe 2f\n"
 	    "	addl 4(%1), %0\n"

