Re: [PATCH net] net: ensure all external references are released in deferred skbuffs
From: Eric Dumazet
Date: Wed Jun 22 2022 - 15:04:29 EST
On Wed, Jun 22, 2022 at 8:19 PM Ilya Maximets <i.maximets@xxxxxxx> wrote:
>
> On 6/22/22 19:03, Eric Dumazet wrote:
> > On Wed, Jun 22, 2022 at 6:47 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> >>
> >> On Wed, Jun 22, 2022 at 6:39 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> >>>
> >>> On Wed, Jun 22, 2022 at 6:29 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> >>>>
> >>>> On Wed, Jun 22, 2022 at 4:26 PM Ilya Maximets <i.maximets@xxxxxxx> wrote:
> >>>>>
> >>>>> On 6/22/22 13:43, Eric Dumazet wrote:
> >>>>
> >>>>>
> >>>>> I tested the patch below and it seems to fix the issue seen
> >>>>> with OVS testsuite. Though it's not obvious for me why this
> >>>>> happens. Can you explain a bit more?
> >>>>
> >>>> Anyway, I am not sure we can call nf_reset_ct(skb) that early.
> >>>>
> >>>> git log seems to say that xfrm check needs to be done before
> >>>> nf_reset_ct(skb), I have no idea why.
> >>>
> >>> Additional remark: In IPv6 side, xfrm6_policy_check() _is_ called
> >>> after nf_reset_ct(skb)
> >>>
> >>> Steffen, do you have some comments ?
> >>>
> >>> Some context:
> >>> commit b59c270104f03960069596722fea70340579244d
> >>> Author: Patrick McHardy <kaber@xxxxxxxxx>
> >>> Date: Fri Jan 6 23:06:10 2006 -0800
> >>>
> >>> [NETFILTER]: Keep conntrack reference until IPsec policy checks are done
> >>>
> >>> Keep the conntrack reference until policy checks have been performed for
> >>> IPsec NAT support. The reference needs to be dropped before a packet is
> >>> queued to avoid having the conntrack module unloadable.
> >>>
> >>> Signed-off-by: Patrick McHardy <kaber@xxxxxxxxx>
> >>> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
> >>>
> >>
> >> Oh well... __xfrm_policy_check() has :
> >>
> >> nf_nat_decode_session(skb, &fl, family);
> >>
> >> This answers my questions.
> >>
> >> This means we are probably missing at least one XFRM check in TCP
> >> stack in some cases.
> >> (Only after adding this XFRM check we can call nf_reset_ct(skb))
> >>
> >
> > Maybe this will help ?
>
> I tested this patch and it seems to fix the OVS problem.
> I did not test the xfrm part of it.
>
> Will you post an official patch?
Yes, I will. I need to double-check that we do not leak either the req
or the child.

Maybe the XFRM check should be done even earlier, on the listening socket?

Or, if we assume the SYNACK packet was sent only after the XFRM check
had been applied to the SYN, maybe we could just call nf_reset_ct(skb)
to lower the risk of regressions.

With the last patch, it would be strange to accept the 3WHS and
establish a socket, but then drop the payload in the 3rd packet...
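
For illustration only, here is a small userspace model of the ordering
constraint discussed in this thread. It is a sketch, not kernel code:
every model_* name and both structs are hypothetical stand-ins, and the
policy is reduced to a single allowed destination address. It assumes,
as the nf_nat_decode_session() call in __xfrm_policy_check() suggests,
that the policy check needs the conntrack entry to recover the pre-NAT
address, so dropping the reference first can change the verdict.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a conntrack entry that remembers NAT state. */
struct model_conn {
	int refcnt;          /* references held by in-flight packets */
	unsigned orig_daddr; /* destination address before NAT */
};

/* Hypothetical stand-in for struct sk_buff. */
struct model_skb {
	unsigned daddr;        /* destination address after NAT */
	struct model_conn *ct; /* conntrack reference, NULL once reset */
};

/* Models nf_reset_ct(): release the packet's conntrack reference. */
void model_reset_ct(struct model_skb *skb)
{
	assert(skb != NULL);
	if (skb->ct) {
		skb->ct->refcnt--;
		skb->ct = NULL;
	}
}

/*
 * Models the policy check: use the pre-NAT address when the conntrack
 * entry is still attached, otherwise fall back to the post-NAT one.
 */
bool model_policy_check(const struct model_skb *skb, unsigned allowed_daddr)
{
	unsigned daddr = skb->ct ? skb->ct->orig_daddr : skb->daddr;

	return daddr == allowed_daddr;
}

/* Check first, then release the reference (the order used in the diff). */
bool model_check_then_reset(struct model_skb *skb, unsigned allowed_daddr)
{
	bool ok = model_policy_check(skb, allowed_daddr);

	/* The model releases the reference on both outcomes. */
	model_reset_ct(skb);
	return ok;
}

/* Release first, then check: the NAT state is gone before it is needed. */
bool model_reset_then_check(struct model_skb *skb, unsigned allowed_daddr)
{
	model_reset_ct(skb);
	return model_policy_check(skb, allowed_daddr);
}
```

In this model, a packet whose pre-NAT address matches the policy passes
model_check_then_reset() but fails model_reset_then_check(), while the
reference count drops to zero in both cases.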
>
> >
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index fe8f23b95d32ca4a35d05166d471327bc608fa91..49c1348e40b6c7b6a98b54d716f29c948e00ba33 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -2019,12 +2019,19 @@ int tcp_v4_rcv(struct sk_buff *skb)
> >  	if (nsk == sk) {
> >  		reqsk_put(req);
> >  		tcp_v4_restore_cb(skb);
> > -	} else if (tcp_child_process(sk, nsk, skb)) {
> > -		tcp_v4_send_reset(nsk, skb);
> > -		goto discard_and_relse;
> >  	} else {
> > -		sock_put(sk);
> > -		return 0;
> > +		if (!xfrm4_policy_check(nsk, XFRM_POLICY_IN, skb)) {
> > +			drop_reason = SKB_DROP_REASON_XFRM_POLICY;
> > +			goto discard_and_relse;
> > +		}
> > +		nf_reset_ct(skb);
> > +		if (tcp_child_process(sk, nsk, skb)) {
> > +			tcp_v4_send_reset(nsk, skb);
> > +			goto discard_and_relse;
> > +		} else {
> > +			sock_put(sk);
> > +			return 0;
> > +		}
> >  	}
> >  }
>