Re: [PATCH net v2] net/sched: Fix UAF when resolving a clash
From: Chengen Du
Date: Mon Jul 08 2024 - 05:43:03 EST
On Mon, Jul 8, 2024 at 4:33 PM Michal Kubiak <michal.kubiak@xxxxxxxxx> wrote:
>
> On Sat, Jul 06, 2024 at 09:42:00AM +0800, Chengen Du wrote:
>
> [...]
>
> > >
> > > > > diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
> > > > > index 2a96d9c1db65..6f41796115e3 100644
> > > > > --- a/net/sched/act_ct.c
> > > > > +++ b/net/sched/act_ct.c
> > > > > @@ -1077,6 +1077,14 @@ TC_INDIRECT_SCOPE int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
> > > > > */
> > > > > if (nf_conntrack_confirm(skb) != NF_ACCEPT)
> > > > > goto drop;
> > > > > +
> > > > > + /* The ct may be dropped if a clash has been resolved,
> > > > > + * so it's necessary to retrieve it from skb again to
> > > > > + * prevent UAF.
> > > > > + */
> > > > > + ct = nf_ct_get(skb, &ctinfo);
> > > > > + if (!ct)
> > > > > + goto drop;
> > > >
> > > > After taking a closer look at this change, I have a question: Why do we
> > > > need to change an action returned by "nf_conntrack_confirm()"
> > > > (NF_ACCEPT) and actually perform the flow for NF_DROP?
> > > >
> > > > From the commit message I understand that you only want to prevent
> > > > calling "tcf_ct_flow_table_process_conn()". But for such reason we have
> > > > a bool variable: "skip_add".
> > > > Shouldn't we just set "skip_add" to true to prevent the UAF?
> > > > Would the following example code make sense in this case?
> > > >
> > > > ct = nf_ct_get(skb, &ctinfo);
> > > > if (!ct)
> > > > skip_add = true;
> >
> > The fix follows from the KASAN analysis. The ct is freed while
> > resolving a clash in the __nf_ct_resolve_clash function, but it is
> > still accessed in the tcf_ct_flow_table_process_conn function. If I
> > understand correctly, the original logic still adds the ct to the flow
> > table after resolving a clash as long as skip_add is false. A drop
> > should be rare because the skb's ct has already been replaced with
> > the hashed one. However, if we still encounter a NULL ct, the
> > situation is unusual and might warrant dropping the packet as a
> > precaution. I am not an expert in this area and might have some
> > misunderstandings. Please share your opinions if you have any
> > concerns.
> >
>
> I'm also not an expert in this part of code. I understand the scenario
> of UAF found by KASAN analysis.
> My only concern is that the patch changes the flow of the function:
> in case of NF_ACCEPT we will go to "drop" instead of performing a normal
> flow.
>
> For example, if "nf_conntrack_confirm()" returns NF_ACCEPT, (even after
> the clash resolving), I would not expect calling "goto drop".
> That is why I suggested a less invasive solution which is just blocking
> calling "tcf_ct_flow_table_process_conn()" where there is a risk of UAF.
> So, I asked if such solution would work in case of this function.
Thank you for explaining your concerns in detail.
In my humble opinion, skip_add is controlled by other logic (whether an
entry should be added to the flow table) and may not be suitable to mix
with error handling. If nf_conntrack_confirm() returns NF_ACCEPT, I
believe there is no reason for nf_ct_get() to return NULL. nf_ct_get()
simply converts skb->_nfct into a struct nf_conn pointer. The only case
where it might fail is when CONFIG_NF_CONNTRACK is disabled, but
CONFIG_NET_ACT_CT depends on that option and determines whether
act_ct.c is compiled at all. So the "goto drop" branch is included for
completeness and might only be relevant if memory is corrupted.
Perhaps we could wrap the check in unlikely() to emphasize this point?
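
To make the argument concrete, here is a minimal userspace sketch of
what I mean. It is not kernel code: every model_* name is hypothetical,
and the pointer-plus-low-bits packing is only my assumption about how
skb->_nfct is laid out. It shows why the ct has to be fetched from the
skb again after a clash is resolved, and what the unlikely() check
would look like.

```c
/* Userspace model only; none of the model_* names are kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the low 3 bits carry ctinfo, the rest carry the pointer. */
#define MODEL_INFOMASK ((uintptr_t)7)
#define MODEL_PTRMASK  (~MODEL_INFOMASK)

/* Stand-in for the kernel's unlikely(); relies on a GCC/Clang builtin. */
#define model_unlikely(x) __builtin_expect(!!(x), 0)

/* 8-byte alignment keeps the low 3 pointer bits free for ctinfo. */
struct model_ct { _Alignas(8) int id; };
struct model_skb { uintptr_t nfct; };

/* Attach a ct and its ctinfo to the skb. */
static void model_ct_set(struct model_skb *skb, struct model_ct *ct,
			 unsigned int info)
{
	skb->nfct = (uintptr_t)ct | ((uintptr_t)info & MODEL_INFOMASK);
}

/* Pure conversion of the stored word; NULL only if nothing is attached. */
static struct model_ct *model_ct_get(const struct model_skb *skb,
				     unsigned int *info)
{
	*info = (unsigned int)(skb->nfct & MODEL_INFOMASK);
	return (struct model_ct *)(skb->nfct & MODEL_PTRMASK);
}

/*
 * Models clash resolution: the skb is switched to the entry that is
 * already hashed. A pointer to the old ct taken before this call must
 * not be used afterwards, because the old ct may have been freed.
 */
static void model_resolve_clash(struct model_skb *skb,
				struct model_ct *winner, unsigned int info)
{
	model_ct_set(skb, winner, info);
}

/* Re-fetch after confirm; returns the ct id, or -1 for the drop path. */
static int model_after_confirm(const struct model_skb *skb)
{
	unsigned int info;
	struct model_ct *ct = model_ct_get(skb, &info);

	if (model_unlikely(!ct))
		return -1;	/* corresponds to "goto drop" */
	return ct->id;
}
```

In this model the drop path is only reachable when nothing is attached
to the skb at all, which is why I consider it a defensive check rather
than a change to the normal NF_ACCEPT flow.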
>
> Thanks,
> Michal
>
> > >
> > > It depends on what tc wants to do here.
> > >
> > > For netfilter, the skb is not dropped and continues passing
> > > through the stack. It's up to the user to decide what to do with it,
> > > e.g. doing "ct state invalid drop".