Re: [PATCH] x86, retpolines: raise limit for generating indirect calls from switch-case

From: Jesper Dangaard Brouer
Date: Fri Feb 22 2019 - 02:31:42 EST

On Thu, 21 Feb 2019 23:19:41 +0100
Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:

> Recent work on XDP from Björn and Magnus additionally found that
> manually transforming the XDP return code switch statement with
> more than 5 cases into if-else combination would result in a
> considerable speedup in XDP layer due to avoidance of indirect
> calls in CONFIG_RETPOLINE enabled builds. On i40e driver with
> XDP prog attached, a 20-26% speedup has been observed [0]. Aside
> from XDP, there are many other places later in the networking
> stack's critical path with similar switch-case processing. Rather
> than fixing every XDP-enabled driver and locations in stack by
> hand, it would be good to instead raise the limit where gcc would
> emit expensive indirect calls from the switch under retpolines

I'm very happy to see this. Thanks to Björn for finding, analyzing and
providing hand-coded-if-else code that demonstrated the performance
issue for XDP. But I do think this GCC case-values-threshold param is
a better and more generic solution to the issue we observed and
measured in XDP land. And hopefully other parts of the network stack
and kernel will also benefit.
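
For readers who have not followed the XDP thread, here is a minimal
sketch of the transformation discussed above. It is not the i40e code:
the action names, return values and function names are hypothetical
stand-ins chosen for illustration. The first function is a dense switch
that a compiler may lower to a jump table (an indirect branch, which is
what gets expensive under retpolines); the second is the hand-written
if-else equivalent that only uses direct compare-and-branch.

```c
/* Hypothetical action codes, loosely modeled on XDP return codes. */
enum act {
	ACT_ABORTED,
	ACT_DROP,
	ACT_PASS,
	ACT_TX,
	ACT_REDIRECT,
	ACT_OTHER,
};

/*
 * Dense switch. Depending on compiler version and flags, this may be
 * lowered to a jump table, i.e. an indirect branch.
 */
int verdict_switch(unsigned int act)
{
	switch (act) {
	case ACT_PASS:
		return 10;
	case ACT_TX:
		return 20;
	case ACT_REDIRECT:
		return 30;
	case ACT_OTHER:
		return 40;
	case ACT_ABORTED:
		return 50;
	case ACT_DROP:
		return 60;
	default:
		return -1;
	}
}

/*
 * Hand-coded if-else chain with the same results. Every test is a
 * direct conditional branch, so no jump table is needed.
 */
int verdict_if_else(unsigned int act)
{
	if (act == ACT_PASS)
		return 10;
	else if (act == ACT_TX)
		return 20;
	else if (act == ACT_REDIRECT)
		return 30;
	else if (act == ACT_OTHER)
		return 40;
	else if (act == ACT_ABORTED)
		return 50;
	else if (act == ACT_DROP)
		return 60;
	return -1;
}
```

Raising the threshold lets the source keep the switch form while the
compiler emits the branch tree itself. One way to check this is to build
with something like "gcc -O2 --param=case-values-threshold=20 -S" and
compare the assembly against a build without the param; the value 20
here is only illustrative.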

Acked-by: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>

Thanks for following up on this, Daniel.
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat