Re: Routing loops & TTL tracking with tunnel devices

From: Jason A. Donenfeld
Date: Thu Apr 28 2022 - 20:37:41 EST


Hey Eric,

On Tue, Nov 17, 2015 at 03:41:35AM +0100, Jason A. Donenfeld wrote:
> On Mon, Nov 16, 2015 at 11:28 PM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
> > There is very little chance we'll accept a new member in sk_buff, unless
> > proven needed.
>
> I actually have no intention of doing this! I'm wondering if there
> already is a member in sk_buff that moonlights as my desired ttl
> counter, or if there's another mechanism for avoiding routing loops. I
> want to work with what's already there, rather than meddling with the
> innards of important and memory sensitive structures such as sk_buff.

Well, 7 years later... Maybe you have a better idea now of what I was
working on then. :)

As an update on this issue, it's still quasi-problematic. To review, I
can't use the TTL value, because the outer packet must always get the
TTL of the route to the outer destination, not the inner packet's TTL
minus one. I can't rely on the packet growing past the MTU as each loop
adds encapsulation overhead, because people want this to work with
fragmentation (see [1] for my attempt to disallow fragmentation for
this issue, which resulted in hoots and hollers). I can't use the
per-cpu xmit_recursion variable, because I use threads.
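
To make the last point concrete, here is a rough userspace sketch, not
kernel code. It assumes the problem is that a depth counter only guards
nested calls in one call chain: once xmit merely queues the packet for a
worker and returns, the counter is back at zero before the packet loops
around again. Every toy_* name and TOY_RECURSION_LIMIT is made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_RECURSION_LIMIT 8	/* hypothetical limit, for illustration only */

/* Stand-in for a per-CPU recursion depth counter. */
static int toy_xmit_recursion;

/*
 * Synchronous tunnel: a looping route re-enters xmit inside the same
 * call chain, so the depth counter grows and the loop is caught.
 */
static bool toy_xmit_sync(int loops_left)
{
	bool ok = true;

	if (toy_xmit_recursion >= TOY_RECURSION_LIMIT)
		return false;	/* loop detected, drop */

	toy_xmit_recursion++;
	if (loops_left > 0)
		ok = toy_xmit_sync(loops_left - 1);
	toy_xmit_recursion--;
	return ok;
}

/*
 * Deferred tunnel: xmit only hands the packet to a worker and returns,
 * so the counter is decremented before the packet is sent again.
 */
static bool toy_xmit_deferred(void)
{
	if (toy_xmit_recursion >= TOY_RECURSION_LIMIT)
		return false;

	toy_xmit_recursion++;
	/* ... packet would be queued for a worker thread here ... */
	toy_xmit_recursion--;
	return true;
}

/*
 * Each iteration models the worker re-injecting the same looping
 * packet. Returns how many passes were accepted.
 */
static int toy_deferred_passes(int max_passes)
{
	int passes = 0;

	/* The worker always starts from a fresh call chain. */
	assert(toy_xmit_recursion == 0);
	while (passes < max_passes && toy_xmit_deferred())
		passes++;
	return passes;
}
```

In this model toy_xmit_sync() refuses a long loop once the depth hits
the limit, while toy_deferred_passes() accepts every pass it is given,
because the counter never sees more than one level.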

What I can sort of rely on is what looks like a bug in pskb expansion:
it always allocates too much, so allocations fail pretty quickly after
a few loops. Only powerpc64 and s390x don't appear to have this bug.
See [2] for an in-depth description of this that I sent you a few
months ago.

Anyway, it'd be nice if there were a free u8 somewhere in sk_buff that I
could use for tracking how many times a packet has been through the
stack. Other kernels have this but afaict Linux still does not. I looked
into overloading some existing fields -- tstamp/skb_mstamp_ns or
queue_mapping -- which I was thinking might be totally unused on TX?
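
For reference, the kind of per-packet counter being asked for could look
roughly like the userspace sketch below. It is only a model: struct
toy_pkt stands in for sk_buff, and the tunnel_hops field,
TUNNEL_MAX_HOPS and the toy_* functions are all hypothetical. No such
field exists in sk_buff today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TUNNEL_MAX_HOPS 8	/* hypothetical limit, not a kernel value */

/* Toy stand-in for sk_buff; only the hypothetical counter matters. */
struct toy_pkt {
	uint8_t tunnel_hops;	/* the "free u8": passes through the tunnel */
};

/*
 * Called once per pass through the tunnel's xmit path. The counter
 * travels with the packet, so it survives being handed to another
 * thread. Returns false when the packet should be dropped.
 */
static bool toy_tunnel_xmit_allowed(struct toy_pkt *pkt)
{
	if (pkt->tunnel_hops >= TUNNEL_MAX_HOPS)
		return false;
	pkt->tunnel_hops++;
	return true;
}

/*
 * Simulate a routing loop: keep re-injecting the packet until it is
 * dropped. Returns the number of passes that were allowed.
 */
static unsigned int toy_loop_until_drop(struct toy_pkt *pkt)
{
	unsigned int passes = 0;

	while (toy_tunnel_xmit_allowed(pkt))
		passes++;
	assert(passes <= TUNNEL_MAX_HOPS);
	return passes;
}
```

Because the count lives in the packet rather than in per-CPU state, it
does not matter which thread or CPU handles each pass; a fresh packet
caught in a loop is dropped after TUNNEL_MAX_HOPS passes.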

Any ideas about this?

Thanks,
Jason

[1] https://lore.kernel.org/wireguard/CAHmME9rNnBiNvBstb7MPwK-7AmAN0sOfnhdR=eeLrowWcKxaaQ@xxxxxxxxxxxxxx/
[2] https://lore.kernel.org/netdev/CAHmME9pv1x6C4TNdL6648HydD8r+txpV4hTUXOBVkrapBXH4QQ@xxxxxxxxxxxxxx/