Re: [net-next] ipv6: fix routing cache overflow for raw sockets
From: Jonathan Maxwell
Date: Sat Dec 24 2022 - 02:38:52 EST
On Sat, Dec 24, 2022 at 7:28 AM Andrea Mayer <andrea.mayer@xxxxxxxxxxx> wrote:
>
> Hi Jon,
> please see below, thanks.
>
> On Wed, 21 Dec 2022 08:48:11 +1100
> Jonathan Maxwell <jmaxwell37@xxxxxxxxx> wrote:
>
> > On Tue, Dec 20, 2022 at 11:35 PM Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
> > >
> > > On Mon, 2022-12-19 at 10:48 +1100, Jon Maxwell wrote:
> > > > Sending IPv6 packets in a loop via a raw socket triggers an issue where a
> > > > route is cloned by ip6_rt_cache_alloc() for each packet sent. This quickly
> > > > consumes the IPv6 max_size threshold, which defaults to 4096, resulting in
> > > > these warnings:
> > > >
> > > > [1] 99.187805] dst_alloc: 7728 callbacks suppressed
> > > > [2] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
> > > > .
> > > > .
> > > > [300] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
> > >
> > > If I read correctly, the maximum number of dst that the raw socket can
> > > use this way is limited by the number of packets it allows via the
> > > sndbuf limit, right?
> > >
> >
> > Yes, but in my test sndbuf limit is never hit so it clones a route for
> > every packet.
> >
> > e.g.:
> >
> > Output from a C program sending 5000000 packets via a raw socket:
> >
> > ip raw: total num pkts 5000000
> >
> > # bpftrace -e 'kprobe:dst_alloc {@count[comm] = count()}'
> > Attaching 1 probe...
> >
> > @count[a.out]: 5000009
> >
> > > Are other FLOWI_FLAG_KNOWN_NH users affected, too? e.g. nf_dup_ipv6,
> > > ipvs, seg6?
> > >
> >
> > Any call to ip6_pol_route() where res.nh->fib_nh_gw_family is 0 can do it.
> > But we have only seen this for raw sockets so far.
> >
>
> In the SRv6 subsystem, the seg6_lookup_nexthop() is used by some
> cross-connecting behaviors such as End.X and End.DX6 to forward traffic to a
> specified nexthop. SRv6 End.X/DX6 can specify an IPv6 DA (i.e., a nexthop)
> different from the one carried by the IPv6 header. For this purpose,
> seg6_lookup_nexthop() sets the FLOWI_FLAG_KNOWN_NH.
>
Hi Andrea,
Thanks for pointing that datapath out. The more generic approach we are
taking, bringing IPv6 closer to IPv4 in this regard, should fix all instances
of this.
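For anyone following the thread, the check that raw sockets and
seg6_lookup_nexthop() both run into can be modeled roughly as below. This is
only a userspace sketch under stated assumptions, not the kernel code: the
struct and function names (model_flow, model_nexthop, needs_uncached_clone,
clones_for_burst) are hypothetical, and the flag value is illustrative. Only
FLOWI_FLAG_KNOWN_NH and fib_nh_gw_family are names taken from the discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value only; the real definition lives in the kernel headers. */
#define FLOWI_FLAG_KNOWN_NH 0x02

/* Hypothetical, stripped-down stand-in for the flow key used in the lookup. */
struct model_flow {
    uint8_t flags;
};

/* Hypothetical stand-in for the nexthop selected by the lookup. */
struct model_nexthop {
    uint8_t fib_nh_gw_family; /* 0 means the route has no gateway ("via") */
};

/* True when, in this model, the lookup would allocate a per-lookup clone. */
static bool needs_uncached_clone(const struct model_flow *fl,
                                 const struct model_nexthop *nh)
{
    return (fl->flags & FLOWI_FLAG_KNOWN_NH) && nh->fib_nh_gw_family == 0;
}

/* Counts the clones a burst of lookups would allocate in this model. */
static unsigned long clones_for_burst(const struct model_flow *fl,
                                      const struct model_nexthop *nh,
                                      unsigned long packets)
{
    unsigned long clones = 0;

    for (unsigned long i = 0; i < packets; i++)
        if (needs_uncached_clone(fl, nh))
            clones++;
    return clones;
}
```

In this model, a flow with FLOWI_FLAG_KNOWN_NH set and a route without a
gateway allocates one clone per packet, so a burst larger than the 4096
default of max_size would be enough to reach the threshold.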
> > > > [1] 99.187805] dst_alloc: 7728 callbacks suppressed
> > > > [2] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
> > > > .
> > > > .
> > > > [300] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
>
> I can reproduce the same warning messages reported by you, by instantiating an
> End.X behavior whose nexthop is handled by a route for which there is no "via".
> In this configuration, the ip6_pol_route() (called by seg6_lookup_nexthop())
> triggers ip6_rt_cache_alloc() because i) FLOWI_FLAG_KNOWN_NH is present and
> ii) res.nh->fib_nh_gw_family is 0 (as already pointed out).
>
Nice. When I get back after the holiday break I'll submit the next patch. It
would be great if you could test the new patch then and let me know how it
works in your tests. I'll keep you posted.
Regards
Jon
> > Regards
> >
> > Jon
>
> Ciao,
> Andrea