Re: [PATCH net] udp: fix segmentation crash for GRO packet without fraglist
From: Willem de Bruijn
Date: Fri Apr 19 2024 - 10:17:26 EST
Lena Wang (王娜) wrote:
> On Wed, 2024-04-17 at 21:15 -0700, Maciej Żenczykowski wrote:
> >
> > External email : Please do not click links or open attachments until
> > you have verified the sender or the content.
> > On Wed, Apr 17, 2024 at 7:53 PM Lena Wang (王娜) <
> > Lena.Wang@xxxxxxxxxxxx> wrote:
> > >
> > > On Wed, 2024-04-17 at 15:48 -0400, Willem de Bruijn wrote:
> > > >
> > > > Lena Wang (王娜) wrote:
> > > > > On Tue, 2024-04-16 at 19:14 -0400, Willem de Bruijn wrote:
> > > > > >
> > > > > > > > > > Personally, I think bpf_skb_pull_data() should have
> > > > > > > > > > automatically (i.e. in kernel code) reduced how much it
> > > > > > > > > > pulls so that it would pull headers only,
> > > > > > > > >
> > > > > > > > > That would be a helper that parses headers to discover header
> > > > > > > > > length.
> > > > > > > >
> > > > > > > > Does it actually need to? Presumably the bpf pull function could
> > > > > > > > notice that it is a packet flagged as being of type X (UDP GSO
> > > > > > > > FRAGLIST) and reduce the pull accordingly so that it doesn't
> > > > > > > > pull anything from the non-linear fraglist portion???
> > > > > > > >
> > > > > > > > I know only the generic overview of what udp gso is, not any
> > > > > > > > details, so I am assuming here that there's some sort of
> > > > > > > > guarantee to how these packets are structured... But I imagine
> > > > > > > > there must be or we wouldn't be hitting these issues deeper in
> > > > > > > > the stack?
> > > > > > >
> > > > > > > Perhaps for a packet of this type we're already guaranteed the
> > > > > > > headers are in the linear portion, and the pull should simply be
> > > > > > > ignored?
> > > > > > >
> > > > > > > >
> > > > > > > > > Parsing is better left to the BPF program.
> > > > > >
> > > > > > I do prefer adding sanity checks to the BPF helpers, over having
> > > > > > to add them in the net hot path only to protect against dangerous
> > > > > > BPF programs.
> > > > > >
> > > > > Is it OK to ignore or decrease the pull length for a UDP GRO
> > > > > fraglist packet? That would keep the packet intact so it can
> > > > > still be sent to the user correctly.
> > > > >
> > > > > In common/net/core/filter.c
> > > > > static inline int __bpf_try_make_writable(struct sk_buff *skb,
> > > > > unsigned int write_len)
> > > > > {
> > > > > +if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
> > > > > +(SKB_GSO_UDP | SKB_GSO_UDP_L4))) {
> > > >
> > > > The issue is not with SKB_GSO_UDP_L4, but with SKB_GSO_FRAGLIST.
> > > >
> > > Currently, only UDP uses SKB_GSO_FRAGLIST for GRO in the kernel. In
> > > udp_offload.c, udp4_gro_complete adds "SKB_GSO_FRAGLIST |
> > > SKB_GSO_UDP_L4" to gso_type. Checking these two flags limits the
> > > match to packets that are "UDP + need GSO + fraglist".
> > >
> > > We could remove the SKB_GSO_UDP_L4 check to cover more packets that
> > > may arrive at skb_segment_list.
> > >
> > > > > +return 0;
> > > >
> > > > Failing for any pull is a bit excessive. And would kill a sane
> > > > workaround of pulling only as many bytes as needed.
> > > >
> > > > > + or if (write_len > skb_headlen(skb))
> > > > > +write_len = skb_headlen(skb);
> > > >
> > > > Truncating requests would be a surprising change of behavior
> > > > for this function.
> > > >
> > > > Failing for a pull > skb_headlen is arguably reasonable, as
> > > > the alternative is that we let it go through but have to drop
> > > > the now malformed packets on segmentation.
> > > >
> > > >
> > > Is it OK as below?
> > >
> > > In common/net/core/filter.c
> > > static inline int __bpf_try_make_writable(struct sk_buff *skb,
> > > unsigned int write_len)
> > > {
> > > + if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
> > > + SKB_GSO_FRAGLIST) && (write_len > skb_headlen(skb))) {
> > > + return 0;
> >
> > please limit write_len to skb_headlen() instead of just returning 0
> >
>
> Hi Maze & Willem,
> Maze's advice is:
> In common/net/core/filter.c
> static inline int __bpf_try_make_writable(struct sk_buff *skb,
> unsigned int write_len)
> {
> + if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
> + SKB_GSO_FRAGLIST) && (write_len > skb_headlen(skb))) {
> + write_len = skb_headlen(skb);
> + }
> return skb_ensure_writable(skb, write_len);
> }
>
> Willem's advice is that "Failing for a pull > skb_headlen is arguably
> reasonable...". I read that as preferring to return 0:
> + if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
> + SKB_GSO_FRAGLIST) && (write_len > skb_headlen(skb))) {
> + return 0;
> + }
>
> The two suggestions seem to conflict. However, I am not sure my
> understanding is right, and I hope to get your further guidance.

I did not mean to return 0, but to fail a request that would pull an
unsafe amount. The caller must get a clear error signal.
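
To make the distinction concrete, here is a minimal userspace sketch (not
kernel code) of failing an oversized pull with an explicit error, as opposed
to returning 0 or truncating write_len. Everything named mock_* or MOCK_* is
hypothetical, and the choice of -EINVAL is an assumption; no specific error
code has been settled on in this thread.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few sk_buff fields the check reads. */
struct mock_skb {
	unsigned int headlen;  /* bytes in the linear area, like skb_headlen() */
	unsigned int gso_size; /* non-zero means GSO, like skb_is_gso() */
	unsigned int gso_type; /* bitmask of MOCK_GSO_* flags */
};

#define MOCK_GSO_FRAGLIST (1u << 0) /* assumed value, not the kernel's */

static bool mock_is_fraglist_gso(const struct mock_skb *skb)
{
	return skb->gso_size != 0 && (skb->gso_type & MOCK_GSO_FRAGLIST);
}

/*
 * Fail, rather than silently succeed or truncate, when a pull on a
 * fraglist GSO packet would reach past the linear area.
 */
int mock_try_make_writable(const struct mock_skb *skb, unsigned int write_len)
{
	if (mock_is_fraglist_gso(skb) && write_len > skb->headlen)
		return -EINVAL; /* assumed error code */

	/* The kernel would call skb_ensure_writable(skb, write_len) here. */
	return 0;
}
```

A pull of at most skb_headlen bytes still succeeds in this sketch, which
keeps the workaround of pulling only as many bytes as needed.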

Back to the original report: the issue should already have been fixed
by commit 876e8ca83667 ("net: fix NULL pointer in skb_segment_list").
But that commit is already present in the kernel for which you report
the error.

Turns out that the crash is not in skb_segment_list, but later in
__udpv4_gso_segment_list_csum, which unconditionally dereferences
udp_hdr(seg).

The above fix also mentions skb pull as the culprit, but does not
include a BPF program. If this can be reached in other ways, then we
do need a stronger test in skb_segment_list, as you propose.

I don't want to narrowly check whether udp_hdr is safe. Essentially,
an SKB_GSO_FRAGLIST skb layout cannot be trusted at all if even one
byte would get pulled.
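
As a rough illustration of that last point, a userspace sketch only: the
orig_headlen field below is hypothetical bookkeeping (the real sk_buff does
not record it), and how such a test would actually be implemented in
skb_segment_list is left open here. The idea is that any change in the
linear length, even one byte, disqualifies the fraglist path.

```c
#include <stdbool.h>

/* Hypothetical model; the real sk_buff has no orig_headlen field. */
struct mock_fraglist_skb {
	unsigned int orig_headlen; /* linear bytes when GRO built the skb (assumed) */
	unsigned int headlen;      /* linear bytes now */
};

enum mock_seg_path {
	MOCK_SEG_FRAGLIST, /* layout untouched: list segmentation is safe */
	MOCK_SEG_REJECT,   /* layout changed: trust no header offset */
};

/* Any change in linear length, even one byte, marks the layout untrusted. */
static bool mock_layout_intact(const struct mock_fraglist_skb *skb)
{
	return skb->headlen == skb->orig_headlen;
}

enum mock_seg_path mock_pick_seg_path(const struct mock_fraglist_skb *skb)
{
	return mock_layout_intact(skb) ? MOCK_SEG_FRAGLIST : MOCK_SEG_REJECT;
}
```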