Re: [PATCH net-next v3] skbuff: Introduce slab_build_skb()

From: Vlastimil Babka
Date: Thu Dec 08 2022 - 06:10:45 EST


On 12/8/22 11:19, Feng Tang wrote:
> On Thu, Dec 08, 2022 at 09:13:41AM +0100, Vlastimil Babka wrote:
>> On 12/8/22 07:02, Kees Cook wrote:
>> > syzkaller reported:
>> >
>> > BUG: KASAN: slab-out-of-bounds in __build_skb_around+0x235/0x340 net/core/skbuff.c:294
>> > Write of size 32 at addr ffff88802aa172c0 by task syz-executor413/5295
>> >
>> > For bpf_prog_test_run_skb(), which uses a kmalloc()ed buffer passed to
>> > build_skb().
>> >
>> > When build_skb() is passed a frag_size of 0, it means the buffer came
>> > from kmalloc. In these cases, ksize() is used to find its actual size,
>> > but since the allocation may not have been made to that size, actually
>> > perform the krealloc() call so that all the associated buffer size
>> > checking will be correctly notified (and use the "new" pointer so that
>> > compiler hinting works correctly). Split this logic out into a new
>> > interface, slab_build_skb(), but leave the original 0 checking for now
>> > to catch any stragglers.
>> >
>> > Reported-by: syzbot+fda18eaa8c12534ccb3b@xxxxxxxxxxxxxxxxxxxxxxxxx
>> > Link: https://groups.google.com/g/syzkaller-bugs/c/UnIKxTtU5-0/m/-wbXinkgAQAJ
>> > Fixes: 38931d8989b5 ("mm: Make ksize() a reporting-only function")
>> > Cc: Jakub Kicinski <kuba@xxxxxxxxxx>
>> > Cc: Eric Dumazet <edumazet@xxxxxxxxxx>
>> > Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
>> > Cc: Paolo Abeni <pabeni@xxxxxxxxxx>
>> > Cc: Pavel Begunkov <asml.silence@xxxxxxxxx>
>> > Cc: pepsipu <soopthegoop@xxxxxxxxx>
>> > Cc: syzbot+fda18eaa8c12534ccb3b@xxxxxxxxxxxxxxxxxxxxxxxxx
>> > Cc: Vlastimil Babka <vbabka@xxxxxxx>
>> > Cc: kasan-dev <kasan-dev@xxxxxxxxxxxxxxxx>
>> > Cc: Andrii Nakryiko <andrii@xxxxxxxxxx>
>> > Cc: ast@xxxxxxxxxx
>> > Cc: bpf <bpf@xxxxxxxxxxxxxxx>
>> > Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
>> > Cc: Hao Luo <haoluo@xxxxxxxxxx>
>> > Cc: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>
>> > Cc: John Fastabend <john.fastabend@xxxxxxxxx>
>> > Cc: jolsa@xxxxxxxxxx
>> > Cc: KP Singh <kpsingh@xxxxxxxxxx>
>> > Cc: martin.lau@xxxxxxxxx
>> > Cc: Stanislav Fomichev <sdf@xxxxxxxxxx>
>> > Cc: song@xxxxxxxxxx
>> > Cc: Yonghong Song <yhs@xxxxxx>
>> > Cc: netdev@xxxxxxxxxxxxxxx
>> > Cc: LKML <linux-kernel@xxxxxxxxxxxxxxx>
>> > Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
>> > ---
>> > v3:
>> > - make sure "resized" is passed back so compiler hints survive
>> > - update kerneldoc (kuba)
>> > v2: https://lore.kernel.org/lkml/20221208000209.gonna.368-kees@xxxxxxxxxx
>> > v1: https://lore.kernel.org/netdev/20221206231659.never.929-kees@xxxxxxxxxx/
>> > ---
>> > drivers/net/ethernet/broadcom/bnx2.c | 2 +-
>> > drivers/net/ethernet/qlogic/qed/qed_ll2.c | 2 +-
>> > include/linux/skbuff.h | 1 +
>> > net/bpf/test_run.c | 2 +-
>> > net/core/skbuff.c | 70 ++++++++++++++++++++---
>> > 5 files changed, 66 insertions(+), 11 deletions(-)
>> >
>> > diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
>> > index fec57f1982c8..b2230a4a2086 100644
>> > --- a/drivers/net/ethernet/broadcom/bnx2.c
>> > +++ b/drivers/net/ethernet/broadcom/bnx2.c
>> > @@ -3045,7 +3045,7 @@ bnx2_rx_skb(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u8 *data,
>> >
>> > dma_unmap_single(&bp->pdev->dev, dma_addr, bp->rx_buf_use_size,
>> > DMA_FROM_DEVICE);
>> > - skb = build_skb(data, 0);
>> > + skb = slab_build_skb(data);
>> > if (!skb) {
>> > kfree(data);
>> > goto error;
>> > diff --git a/drivers/net/ethernet/qlogic/qed/qed_ll2.c b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
>> > index ed274f033626..e5116a86cfbc 100644
>> > --- a/drivers/net/ethernet/qlogic/qed/qed_ll2.c
>> > +++ b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
>> > @@ -200,7 +200,7 @@ static void qed_ll2b_complete_rx_packet(void *cxt,
>> > dma_unmap_single(&cdev->pdev->dev, buffer->phys_addr,
>> > cdev->ll2->rx_size, DMA_FROM_DEVICE);
>> >
>> > - skb = build_skb(buffer->data, 0);
>> > + skb = slab_build_skb(buffer->data);
>> > if (!skb) {
>> > DP_INFO(cdev, "Failed to build SKB\n");
>> > kfree(buffer->data);
>> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> > index 7be5bb4c94b6..0b391b635430 100644
>> > --- a/include/linux/skbuff.h
>> > +++ b/include/linux/skbuff.h
>> > @@ -1253,6 +1253,7 @@ struct sk_buff *build_skb_around(struct sk_buff *skb,
>> > void skb_attempt_defer_free(struct sk_buff *skb);
>> >
>> > struct sk_buff *napi_build_skb(void *data, unsigned int frag_size);
>> > +struct sk_buff *slab_build_skb(void *data);
>> >
>> > /**
>> > * alloc_skb - allocate a network buffer
>> > diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
>> > index 13d578ce2a09..611b1f4082cf 100644
>> > --- a/net/bpf/test_run.c
>> > +++ b/net/bpf/test_run.c
>> > @@ -1130,7 +1130,7 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>> > }
>> > sock_init_data(NULL, sk);
>> >
>> > - skb = build_skb(data, 0);
>> > + skb = slab_build_skb(data);
>> > if (!skb) {
>> > kfree(data);
>> > kfree(ctx);
>> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> > index 1d9719e72f9d..ae5a6f7db37b 100644
>> > --- a/net/core/skbuff.c
>> > +++ b/net/core/skbuff.c
>> > @@ -269,12 +269,10 @@ static struct sk_buff *napi_skb_cache_get(void)
>> > return skb;
>> > }
>> >
>> > -/* Caller must provide SKB that is memset cleared */
>> > -static void __build_skb_around(struct sk_buff *skb, void *data,
>> > - unsigned int frag_size)
>> > +static inline void __finalize_skb_around(struct sk_buff *skb, void *data,
>> > + unsigned int size)
>> > {
>> > struct skb_shared_info *shinfo;
>> > - unsigned int size = frag_size ? : ksize(data);
>> >
>> > size -= SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>> >
>> > @@ -296,15 +294,71 @@ static void __build_skb_around(struct sk_buff *skb, void *data,
>> > skb_set_kcov_handle(skb, kcov_common_handle());
>> > }
>> >
>> > +static inline void *__slab_build_skb(struct sk_buff *skb, void *data,
>> > + unsigned int *size)
>> > +{
>> > + void *resized;
>> > +
>> > + /* Must find the allocation size (and grow it to match). */
>> > + *size = ksize(data);
>> > + /* krealloc() will immediately return "data" when
>> > + * "ksize(data)" is requested: it is the existing upper
>> > + * bounds. As a result, GFP_ATOMIC will be ignored. Note
>> > + * that this "new" pointer needs to be passed back to the
>> > + * caller for use so the __alloc_size hinting will be
>> > + * tracked correctly.
>> > + */
>> > + resized = krealloc(data, *size, GFP_ATOMIC);
>>
>> Hmm, I just realized, this trick will probably break the new kmalloc size
>> tracking from Feng Tang (CC'd)? We need to make krealloc() update the stored
>> size, right? And even worse if slab_debug redzoning is enabled and after
>> commit 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated
>> kmalloc space than requested") where the lack of update will result in
>> redzone check failures.
>
> I think it's still safe, as currently we skip the kmalloc redzone check
> by calling skip_orig_size_check() inside __ksize(). But as we plan

Ah, right, I forgot. So that's good.

> to remove this skip_orig_size_check() after all ksize() usage has been
> sanitized, we need to cover this krealloc() case.

Yeah, can be done as part of the removal then, thanks.

> Thanks,
> Feng
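
For readers following the thread, the concern above can be sketched as a
toy userspace model. This is not kernel code: every toy_* name, the
power-of-two bucket rule, and the header layout are assumptions made up
for illustration. It only shows why a krealloc() to ksize(data) can hand
back the same pointer, and why the stored request size then needs
updating if later redzone checks are to stay consistent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: all names and the layout below are hypothetical. */
struct toy_obj {
	size_t bucket_size;	/* usable size of the backing bucket */
	size_t orig_size;	/* size the caller last asked for */
	unsigned char data[];	/* what the caller sees */
};

/* Assumed bucket rule: round up to a power of two, minimum 8. */
static size_t toy_bucket(size_t n)
{
	size_t b = 8;

	while (b < n)
		b <<= 1;
	return b;
}

static struct toy_obj *toy_hdr(void *p)
{
	assert(p != NULL);
	return (struct toy_obj *)((unsigned char *)p -
				  offsetof(struct toy_obj, data));
}

void *toy_kmalloc(size_t n)
{
	size_t b = toy_bucket(n);
	struct toy_obj *o = malloc(sizeof(*o) + b);

	if (!o)
		return NULL;
	o->bucket_size = b;
	o->orig_size = n;
	return o->data;
}

/* Reports the bucket size, like a reporting-only ksize(). */
size_t toy_ksize(void *p)
{
	return toy_hdr(p)->bucket_size;
}

size_t toy_orig_size(void *p)
{
	return toy_hdr(p)->orig_size;
}

void toy_kfree(void *p)
{
	if (p)
		free(toy_hdr(p));
}

void *toy_krealloc(void *p, size_t n)
{
	struct toy_obj *o = toy_hdr(p);
	void *q;

	if (n <= o->bucket_size) {
		/* Same pointer comes back; the tracked size must follow,
		 * otherwise a check based on orig_size would treat the
		 * newly claimed bytes as a redzone.
		 */
		o->orig_size = n;
		return p;
	}

	q = toy_kmalloc(n);
	if (!q)
		return NULL;
	memcpy(q, p, o->bucket_size);	/* old bucket is smaller than n */
	free(o);
	return q;
}
```

In this model, toy_kmalloc(100) lands in a 128-byte bucket with a tracked
size of 100; toy_krealloc(p, toy_ksize(p)) returns p unchanged and moves
the tracked size to 128.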