Re: [PATCH net] net: ensure all external references are released in deferred skbuffs

From: Eric Dumazet
Date: Wed Jun 22 2022 - 12:50:23 EST


On Wed, Jun 22, 2022 at 6:39 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>
> On Wed, Jun 22, 2022 at 6:29 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> >
> > On Wed, Jun 22, 2022 at 4:26 PM Ilya Maximets <i.maximets@xxxxxxx> wrote:
> > >
> > > On 6/22/22 13:43, Eric Dumazet wrote:
> >
> > >
> > > I tested the patch below and it seems to fix the issue seen
> > > with OVS testsuite. Though it's not obvious for me why this
> > > happens. Can you explain a bit more?
> >
> > Anyway, I am not sure we can call nf_reset_ct(skb) that early.
> >
> > git log seems to say that xfrm check needs to be done before
> > nf_reset_ct(skb), I have no idea why.
>
> Additional remark: In IPv6 side, xfrm6_policy_check() _is_ called
> after nf_reset_ct(skb)
>
> Steffen, do you have some comments ?
>
> Some context:
> commit b59c270104f03960069596722fea70340579244d
> Author: Patrick McHardy <kaber@xxxxxxxxx>
> Date: Fri Jan 6 23:06:10 2006 -0800
>
> [NETFILTER]: Keep conntrack reference until IPsec policy checks are done
>
> Keep the conntrack reference until policy checks have been performed for
> IPsec NAT support. The reference needs to be dropped before a packet is
> queued to avoid having the conntrack module unloadable.
>
> Signed-off-by: Patrick McHardy <kaber@xxxxxxxxx>
> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
>

Oh well... __xfrm_policy_check() has :

nf_nat_decode_session(skb, &fl, family);

This answers my questions.

This means we are probably missing at least one XFRM check in TCP
stack in some cases.
(Only after adding this XFRM check we can call nf_reset_ct(skb))

>
> >
> > I suspect some incoming packets are not going through
> > xfrm4_policy_check() and end up being stored in a TCP receive queue.
> >
> > Maybe something is missing before calling tcp_child_process()
> >
> >
> > >
> > > >
> > > > I note that IPv6 does the nf_reset_ct() earlier, from ip6_protocol_deliver_rcu()
> > > >
> > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > > > index fda811a5251f2d76ac24a036e6b4f4e7d7d96d6f..a06464f96fe0cc94dd78272738ddaab2c19e87db
> > > > 100644
> > > > --- a/net/ipv4/tcp_ipv4.c
> > > > +++ b/net/ipv4/tcp_ipv4.c
> > > > @@ -1919,6 +1919,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > > > struct sock *sk;
> > > > int ret;
> > > >
> > > > + nf_reset_ct(skb);
> > > > +
> > > > drop_reason = SKB_DROP_REASON_NOT_SPECIFIED;
> > > > if (skb->pkt_type != PACKET_HOST)
> > > > goto discard_it;
> > > > @@ -2046,8 +2048,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > > > if (drop_reason)
> > > > goto discard_and_relse;
> > > >
> > > > - nf_reset_ct(skb);
> > > > -
> > > > if (tcp_filter(sk, skb)) {
> > > > drop_reason = SKB_DROP_REASON_SOCKET_FILTER;
> > > > goto discard_and_relse;
> > >