Re: [PATCH] x86_64: inline csum_ipv6_magic()
From: Eric Dumazet
Date: Thu Nov 13 2025 - 13:18:21 EST
On Thu, Nov 13, 2025 at 8:26 AM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>
> On 11/13/25 07:45, Eric Dumazet wrote:
> > Inline this small helper.
> >
> > This reduces register pressure, as saddr and daddr are often
> > back to back in memory.
> >
> > For instance code inlined in tcp6_gro_receive() will look like:
>
> Could you please double check what the code growth is for this across
> the tree? There are 80-ish users of csum_ipv6_magic().
Hi Dave,
Sure, here are the numbers for an allyesconfig build.
Before patch:
size vmlinux
     text       data       bss         dec       hex  filename
886947242  245613190  40211540  1172771972  45e71484  vmlinux
After patch:
size vmlinux
     text       data       bss         dec       hex  filename
886947242  245613190  40211540  1172771972  45e71484  vmlinux
The two builds are identical in size, which I found a bit surprising,
so I also did a regular build (our Google production kernel default config).
Before:
size vmlinux
    text      data      bss       dec      hex  filename
34812872  22177397  5685248  62675517  3bc5a3d  vmlinux
After:
size vmlinux
    text      data      bss       dec      hex  filename
34812501  22177365  5685248  62675114  3bc58aa  vmlinux
So it would seem the patch saves 371 bytes of text for this config
(403 bytes in total, counting 32 bytes of data).
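For anyone who wants to recheck the arithmetic, here is a minimal sketch in C that recomputes the savings from two rows of size output. The struct and function names are only illustrative, they do not come from any tool or from the patch.

```c
#include <stdio.h>

/* One row of `size` output (illustrative type, not from any tool). */
struct size_row {
	long text;
	long data;
	long bss;
};

/* Total bytes saved going from `before` to `after`; positive means smaller. */
long size_saved(struct size_row before, struct size_row after)
{
	return (before.text - after.text) +
	       (before.data - after.data) +
	       (before.bss - after.bss);
}

/* Print the per-section and total savings on one line. */
void print_size_delta(struct size_row before, struct size_row after)
{
	printf("text %ld data %ld bss %ld total %ld\n",
	       before.text - after.text,
	       before.data - after.data,
	       before.bss - after.bss,
	       size_saved(before, after));
}
```

Feeding it the two rows of the regular build (34812872/22177397/5685248 before, 34812501/22177365/5685248 after) gives the text and total savings for that config.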
>
> Or, is there a discrete, measurable performance gain from doing this?
IPv6 incoming TCP/UDP paths call this function twice per packet, which is sad...
The TX path calls it once per packet.
Depending on the CPU, I can see csum_ipv6_magic() using up to 0.75%
of CPU cycles.
Then there is the cost in the callers, harder to measure...
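For readers not familiar with the helper, here is a rough portable sketch of what an IPv6 pseudo-header checksum computes. This is not the x86_64 code from the patch: the function names, the byte-array arguments and the host-order partial sum are assumptions made for illustration, and the real helper works on native-endian loads with add-with-carry instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Add n bytes (n even) to a running sum as big-endian 16-bit words. */
static uint64_t sum16_be(const uint8_t *p, size_t n, uint64_t sum)
{
	for (size_t i = 0; i + 1 < n; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	return sum;
}

/*
 * Hypothetical stand-in for csum_ipv6_magic(): one's complement sum over
 * saddr, daddr, the 32-bit upper-layer length and the next-header value,
 * plus a partial sum of the payload, folded to 16 bits and inverted.
 */
uint16_t ipv6_pseudo_csum(const uint8_t saddr[16], const uint8_t daddr[16],
			  uint32_t len, uint8_t proto, uint32_t partial)
{
	uint64_t sum = partial;

	sum = sum16_be(saddr, 16, sum);
	sum = sum16_be(daddr, 16, sum);
	sum += (len >> 16) + (len & 0xffff);	/* length as two 16-bit words */
	sum += proto;				/* three zero bytes + next header */

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

The two 16-byte address loads are the part the commit message refers to: since saddr and daddr usually sit back to back in the header, an inlined version can address both from a single base pointer.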
Thank you.