Re: [PATCH net] udp: fix segmentation crash for GRO packet without fraglist

From: Lena Wang (王娜)
Date: Wed Apr 17 2024 - 03:20:22 EST


On Tue, 2024-04-16 at 19:14 -0400, Willem de Bruijn wrote:
>
> > > > > > Personally, I think bpf_skb_pull_data() should have automatically
> > > > > > (ie. in kernel code) reduced how much it pulls so that it would pull
> > > > > > headers only,
> > > >
> > > > > That would be a helper that parses headers to discover header length.
> > >
> > > > Does it actually need to? Presumably the bpf pull function could
> > > > notice that it is a packet flagged as being of type X (UDP GSO
> > > > FRAGLIST) and reduce the pull accordingly so that it doesn't pull
> > > > anything from the non-linear fraglist portion???
> > >
> > > > I know only the generic overview of what udp gso is, not any details,
> > > > so I am assuming here that there's some sort of guarantee to how these
> > > > packets are structured... But I imagine there must be or we wouldn't
> > > > be hitting these issues deeper in the stack?
> >
> > > Perhaps for a packet of this type we're already guaranteed the headers
> > > are in the linear portion, and the pull should simply be ignored?
> >
> > >
> > > > Parsing is better left to the BPF program.
>
> I do prefer adding sanity checks to the BPF helpers, over having to
> add them in the net hot path only to protect against dangerous BPF
> programs.
>
Is it OK to ignore the pull, or to reduce the pull length, for a UDP GRO
fraglist packet? That would keep the packet intact so that it can still
be delivered to the user correctly.

In common/net/core/filter.c:
static inline int __bpf_try_make_writable(struct sk_buff *skb,
					  unsigned int write_len)
{
+	if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
+				(SKB_GSO_UDP | SKB_GSO_UDP_L4))) {
+		/* either ignore the pull entirely: */
+		return 0;
+
+		/* or clamp it to the linear area: */
+		if (write_len > skb_headlen(skb))
+			write_len = skb_headlen(skb);
+	}
	return skb_ensure_writable(skb, write_len);
}


> In this case, it would be detecting this GSO type and failing the
> operation if exceeding skb_headlen().
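
To make the difference between clamping and failing concrete, here is a
small userspace model, not kernel code. struct model_skb and both
function names are made up for illustration, the gso_fraglist flag stands
in for the gso_type test, and -EINVAL is only an assumed error code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few sk_buff properties involved. */
struct model_skb {
	unsigned int len;     /* total bytes: linear area plus frag_list */
	unsigned int headlen; /* bytes in the linear area, like skb_headlen() */
	bool gso_fraglist;    /* true for a UDP GRO fraglist packet */
};

/* Policy A: silently clamp the pull to the linear area. */
static unsigned int pull_len_clamped(const struct model_skb *skb,
				     unsigned int write_len)
{
	if (skb->gso_fraglist && write_len > skb->headlen)
		return skb->headlen;
	return write_len;
}

/* Policy B: fail the operation if it would reach into the frag_list. */
static int pull_len_checked(const struct model_skb *skb,
			    unsigned int write_len)
{
	if (skb->gso_fraglist && write_len > skb->headlen)
		return -EINVAL; /* assumed error code */
	return 0;
}
```

With policy A the BPF program never learns that it asked for too much;
with policy B it gets an error and can retry with a shorter length.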
> > > >
> > > > > and not packet content.
> > > > > > (This is assuming the rest of the code isn't ready to deal with a
> > > > > > longer pull, which I think is the case atm. Pulling too much, and
> > > > > > then crashing or forcing the stack to drop packets because of them
> > > > > > being malformed seems wrong...)
> > > > >
> > > > > > In general it would be nice if there was a way to just say pull all
> > > > > > headers... (or possibly all L2/L3/L4 headers)
> > > > > > You in general need to pull stuff *before* you've even looked at the
> > > > > > packet, so that you can look at the packet,
> > > > > > so it's relatively hard/annoying to pull the correct length from bpf
> > > > > > code itself.
> > > > >
> > > > > > > > > BPF needs to modify a proper length to do pull data. However kernel
> > > > > > > > > should also improve the flow to avoid crash from a bpf function call.
> > > > > > > > > As there is no split flow and app may not decode the merged UDP packet,
> > > > > > > > > we should drop the packet without fraglist in skb_segment_list here.
> > > > > > > >
> > > > > > > > > Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
> > > > > > > > > Signed-off-by: Shiming Cheng <shiming.cheng@xxxxxxxxxxxx>
> > > > > > > > > Signed-off-by: Lena Wang <lena.wang@xxxxxxxxxxxx>
> > > > > > > > ---
> > > > > > > > net/core/skbuff.c | 3 +++
> > > > > > > > 1 file changed, 3 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > > > > > > index b99127712e67..f68f2679b086 100644
> > > > > > > > --- a/net/core/skbuff.c
> > > > > > > > +++ b/net/core/skbuff.c
> > > > > > > > @@ -4504,6 +4504,9 @@ struct sk_buff
> *skb_segment_list(struct
> > > > > > > sk_buff *skb,
> > > > > > > > if (err)
> > > > > > > > goto err_linearize;
> > > > > > > >
> > > > > > > > +if (!list_skb)
> > > > > > > > +goto err_linearize;
> > > > > > > > +
> > > >
> > > > > This would catch the case where the entire data frag_list is
> > > > > linearized, but not a pskb_may_pull that only pulls in part of the
> > > > > list.
> > > > >
> > > > > Even with BPF being privileged, the kernel should not crash if BPF
> > > > > pulls a FRAGLIST GSO skb.
> > > > >
> > > > > But the check needs to be refined a bit. For a UDP GSO packet, I
> > > > > think gso_size is still valid, so if the head_skb length does not
> > > > > match gso_size, it has been messed with and should be dropped.
> > > >
Would the change below be OK? Also, is it OK to add a log message that
records the error, so that the issue is easier to debug?

In net/core/skbuff.c, skb_segment_list():
+unsigned int mss = skb_shinfo(head_skb)->gso_size;
+bool err_len = false;

+if (mss != GSO_BY_FRAGS && mss != skb_headlen(head_skb)) {
+	pr_err("skb is dropped due to messed data. gso size:%u, hdrlen:%u\n",
+	       mss, skb_headlen(head_skb));
+	if (!list_skb)
+		goto err_linearize;
+	else
+		err_len = true;
+}

...
+if (err_len) {
+	goto err_linearize;
+}

skb_get(skb);
...
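
To show which cases each check would catch, here is a small userspace
model, not kernel code. struct model_skb and the function names are
invented for illustration, and headlen is assumed to mean the linear
payload length as seen at the point of the check. MODEL_GSO_BY_FRAGS
mirrors the kernel's GSO_BY_FRAGS value.

```c
#include <stdbool.h>

#define MODEL_GSO_BY_FRAGS 0xFFFF /* mirrors the kernel's GSO_BY_FRAGS */

/* Hypothetical stand-in for the head skb of a fraglist GRO packet. */
struct model_skb {
	unsigned int gso_size; /* per-segment payload size */
	unsigned int headlen;  /* linear payload bytes in the head skb */
	bool has_frag_list;    /* false once the list is fully linearized */
};

/* Check from the original patch: only a missing frag_list is caught. */
static bool drops_on_null_fraglist(const struct model_skb *skb)
{
	return !skb->has_frag_list;
}

/* Refined check: the head no longer holds exactly one segment. */
static bool drops_on_len_mismatch(const struct model_skb *skb)
{
	return skb->gso_size != MODEL_GSO_BY_FRAGS &&
	       skb->gso_size != skb->headlen;
}
```

In this model an untouched packet passes both checks, a partially pulled
packet (frag_list still present, head longer than gso_size) is caught
only by the length check, and a fully linearized packet is caught by
both.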

> > > > > For a GSO_BY_FRAGS skb, there is no single gso_size, and this pull
> > > > > may be entirely undetectable as long as frag_list != NULL?
> > > >
> > > >
skb_segment_list() only handles UDP fraglist GRO packets, so nr_frags
will be 0 here.

__pskb_pull_tail() records SKB_GSO_DODGY in gso_type when the frag_list
is partially eaten, and skb_segment() checks for it and disables
NETIF_F_SG.
skb_segment() can segment the data by gso_size even if it has been pulled
into the header skb. I am not sure whether it can still decode the packet
when frag_list is NULL or partially eaten, because no BPF program pulls
an illegal length for TCP packets. Our platform has not hit any issues in
skb_segment() for TCP packets so far.

> > > > > > > > skb_shinfo(skb)->frag_list = NULL;
> > > > > > >
> > > > > > > > In absence of plugging the issue in BPF, dropping here is the best
> > > > > > > > we can do indeed, I think.
> >
> > --
> > Maciej Żenczykowski, Kernel Networking Developer @ Google
>
>