Re: Regression: kernel 4.14 and later very slow with many ipsec tunnels

From: Wolfgang Walter
Date: Tue Oct 02 2018 - 10:46:00 EST


Hello,

On Friday, 14 September 2018, 07:54:37, Florian Westphal wrote:
> Steffen Klassert <steffen.klassert@xxxxxxxxxxx> wrote:
> > On Thu, Sep 13, 2018 at 11:03:25PM +0200, Florian Westphal wrote:
> > > David Miller <davem@xxxxxxxxxxxxx> wrote:
> > > > From: Florian Westphal <fw@xxxxxxxxx>
> > > > Date: Thu, 13 Sep 2018 18:38:48 +0200
> > > >
> > > > > Wolfgang Walter <linux@xxxxxxx> wrote:
> > > > >> What I can say is that it depends mainly on number of policy rules
> > > > >> and SA.
> > > > >
> > > > > > That's already a good hint, I guess we're hitting long hash chains in
> > > > > xfrm_policy_lookup_bytype().
> > > >
> > > > I don't really see how recent changes can influence that.
> > >
> > > I don't think there is a recent change that did this.
> > >
> > > Walter says < 4.14 is ok, so this is likely related to flow cache
> > > removal.
> > >
> > > F.e. it looks like all prefixed policies end up in a linked list
> > > (net->xfrm.policy_inexact) and are not even in a hash table.
> > >
> > > I am staring at b58555f1767c9f4e330fcf168e4e753d2d9196e0
> > > but can't figure out how to configure that away from the
> > > 'no hashing for prefixed policies' default or why we even have
> > > > policy_inexact in the first place :/
> >
> > The hash threshold can be configured like this:
> >
> > ip x p set hthresh4 0 0
> >
> > This sets the hash threshold to local /0 and remote /0 netmasks.
> > With this configuration, all policies should go to the hashtable.
>
> Yes, but won't they all be hashed to same bucket?
>
> [ jhash(addr & 0, addr & 0) ] ?
>
> > Default hash thresholds are local /32 and remote /32 netmasks, so
> > all prefixed policies go to the inexact list.
>
> Yes.
>
> Wolfgang, before having to work on getting perf into your router image
> can you perhaps share a bit of info about the policies you're using?
>
> How many are there? Are they prefixed or not ("10.1.2.1")?
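
To make the bucket question quoted above concrete, here is a minimal sketch,
not the kernel's code. It assumes the hash key is built by masking the local
and remote address with the threshold prefix lengths, as the quoted
"jhash(addr & 0, addr & 0)" suggests. The names prefix_mask and bucket, the
table size of 1024, and the use of Python's hash() as a stand-in for jhash
are all assumptions made for illustration.

```python
import ipaddress


def prefix_mask(bits: int) -> int:
    """Return the 32-bit netmask for an IPv4 prefix length (0..32)."""
    if not 0 <= bits <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    return (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF


def bucket(local: str, remote: str, lbits: int, rbits: int,
           nbuckets: int = 1024) -> int:
    """Mask both addresses to the threshold prefixes, then hash the pair.

    hash() on a tuple of ints is only a stand-in for the kernel's jhash;
    nbuckets is a hypothetical table size.
    """
    l = int(ipaddress.IPv4Address(local)) & prefix_mask(lbits)
    r = int(ipaddress.IPv4Address(remote)) & prefix_mask(rbits)
    return hash((l, r)) % nbuckets


if __name__ == "__main__":
    pairs = [(f"10.0.{i}.1", "10.9.9.9") for i in range(50)]
    # Thresholds /0 and /0: both keys are masked down to zero.
    print(len({bucket(l, r, 0, 0) for l, r in pairs}))
    # Thresholds /24 and /24: the masked keys keep the network part.
    print(len({bucket(l, r, 24, 24) for l, r in pairs}))
```

Under these assumptions, thresholds of 0 0 reduce every address pair to the
same key, so all entries share one bucket and the lookup is again a walk over
a single chain. Whether the kernel behaves exactly this way would have to be
checked against the xfrm hashing code.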

I have not received an answer since my last reply to this message. Has there
been any progress on fixing this performance regression that I missed?

Or are we stuck with the longterm 4.9 kernel for a long time?


Regards,
--
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts