Re: [PATCH net-next] tcp: add tcp_tx_skb_cache_key checking in sk_stream_alloc_skb()

From: Paolo Abeni
Date: Wed Sep 01 2021 - 12:01:13 EST


On Wed, 2021-09-01 at 17:25 +0200, Paolo Abeni wrote:
> On Wed, 2021-09-01 at 08:16 -0700, Eric Dumazet wrote:
> > On Wed, Sep 1, 2021 at 8:06 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> > > On Wed, Sep 1, 2021 at 3:52 AM Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
> > > > On Wed, 2021-09-01 at 18:39 +0800, Yunsheng Lin wrote:
> > > > > Since tcp_tx_skb_cache is disabled by default in:
> > > > > commit 0b7d7f6b2208 ("tcp: add tcp_tx_skb_cache sysctl")
> > > > >
> > > > > Add tcp_tx_skb_cache_key checking in sk_stream_alloc_skb() to
> > > > > avoid possible branch-misses.
> > > > >
> > > > > Signed-off-by: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
> > > >
> > > > Note that MPTCP is currently exploiting sk->sk_tx_skb_cache. If we get
> > > > this patch goes in as-is, it will break mptcp.
> > > >
> > > > One possible solution would be to let mptcp usage enable sk-
> > > > > sk_tx_skb_cache, but that has relevant side effects on plain TCP.
> > > >
> > > > Another options would be re-work once again the mptcp xmit path to
> > > > avoid using sk->sk_tx_skb_cache.
> > > >
> > >
> > > Hmmm, I actually wrote a revert of this feature but forgot to submit
> > > it last year.
> > >
> > > commit c36cfbd791f62c0f7c6b32132af59dfdbe6be21b (HEAD -> listener_scale4)
> > > Author: Eric Dumazet <edumazet@xxxxxxxxxx>
> > > Date: Wed May 20 06:38:38 2020 -0700
> > >
> > > tcp: remove sk_{tr}x_skb_cache
> > >
> > > This reverts the following patches :
> > >
> > > 2e05fcae83c41eb2df10558338dc600dc783af47 ("tcp: fix compile error
> > > if !CONFIG_SYSCTL")
> > > 4f661542a40217713f2cee0bb6678fbb30d9d367 ("tcp: fix zerocopy and
> > > notsent_lowat issues")
> > > 472c2e07eef045145bc1493cc94a01c87140780a ("tcp: add one skb cache for tx")
> > > 8b27dae5a2e89a61c46c6dbc76c040c0e6d0ed4c ("tcp: add one skb cache for rx")
> > >
> > > Having a cache of one skb (in each direction) per TCP socket is fragile,
> > > since it can cause a significant increase of memory needs,
> > > and not good enough for high speed flows anyway where more than one skb
> > > is needed.
> > >
> > > We want instead to add a generic infrastructure, with more flexible per-cpu
> > > caches, for alien NUMA nodes.
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@xxxxxxxxxx>
> > >
> > > I will update this commit to also remove the part in MPTCP.
> > >
> > > Let's remove this feature and replace it with something less costly.
> >
> > Paolo, can you work on MPTP side, so that my revert can be then applied ?
>
> You are way too fast, I was still replying to your previous email,
> asking if I could help :)
>
> I'll a look ASAP. Please, allow for some latency: I'm way slower!

I think the easiest way and the one with less code duplication will
require accessing the tcp_mark_push() and skb_entail() helpers from the
MPTCP code, making them not static and exposing them e.g. in net/tcp.h.
Would that be acceptable or should I look for other options?

Thanks!

Paolo