Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets

From: Ilias Apalodimas
Date: Thu Jul 15 2021 - 11:03:03 EST


On Thu, Jul 15, 2021 at 07:57:57AM -0700, Alexander Duyck wrote:
> On Thu, Jul 15, 2021 at 7:45 AM Ilias Apalodimas
> <ilias.apalodimas@xxxxxxxxxx> wrote:
> >
> > > > > atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
> >
> > [...]
> >
> > > > > &shinfo->dataref))
> > > > > - return;
> > > > > + goto exit;
> > > >
> > > > Is it possible this patch may break the head frag page for the original skb,
> > > > supposing it's head frag page is from the page pool and below change clears
> > > > the pp_recycle for original skb, causing a page leaking for the page pool?
> > >
> > > I don't see how. The assumption here is that when atomic_sub_return
> > > gets down to 0 we will still have an skb with skb->pp_recycle set and
> > > it will flow down and encounter skb_free_head below. All we are doing
> > > is skipping those steps and clearing skb->pp_recycle for all but the
> > > last buffer and the last one to free it will trigger the recycling.
> >
> > I think the assumption here is that
> > 1. We clone an skb
> > 2. The original skb goes into pskb_expand_head()
> > 3. skb_release_data() will be called for the original skb
> >
> > But with the dataref bumped, we'll skip the recycling for it but we'll also
> > skip recycling or unmapping the current head (which is a page_pool mapped
> > buffer)
>
> Right, but in that case it is the clone that is left holding the
> original head and the skb->pp_recycle flag is set on the clone as it
> was copied from the original when we cloned it.

Ah yes, that's what I missed

> What we have
> essentially done is transferred the responsibility for freeing it from
> the original to the clone.
>
> If you think about it the result is the same as if step 2 was to go
> into kfree_skb. We would still be calling skb_release_data and the
> dataref would be decremented without the original freeing the page. We
> have to wait until all the clones are freed and dataref reaches 0
> before the head can be recycled.

Yep sounds correct

Thanks
/Ilias