Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
From: Alexander Duyck
Date: Thu Jul 15 2021 - 10:58:18 EST
On Thu, Jul 15, 2021 at 7:45 AM Ilias Apalodimas
<ilias.apalodimas@xxxxxxxxxx> wrote:
>
> > > > atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
>
> [...]
>
> > > > &shinfo->dataref))
> > > > - return;
> > > > + goto exit;
> > >
> > > Is it possible this patch may break the head frag page for the original skb,
> > > supposing its head frag page is from the page pool and the change below clears
> > > pp_recycle for the original skb, causing a page leak for the page pool?
> >
> > I don't see how. The assumption here is that when atomic_sub_return
> > gets down to 0 we will still have an skb with skb->pp_recycle set and
> > it will flow down and encounter skb_free_head below. All we are doing
> > is skipping those steps and clearing skb->pp_recycle for all but the
> > last buffer and the last one to free it will trigger the recycling.
>
> I think the assumption here is that
> 1. We clone an skb
> 2. The original skb goes into pskb_expand_head()
> 3. skb_release_data() will be called for the original skb
>
> But with the dataref bumped, we'll skip the recycling for it but we'll also
> skip recycling or unmapping the current head (which is a page_pool mapped
> buffer)

Right, but in that case it is the clone that is left holding the
original head, and the skb->pp_recycle flag is set on the clone because
it was copied from the original when we cloned it. What we have
essentially done is transfer the responsibility for freeing the head
from the original to the clone.

If you think about it, the result is the same as if step 2 were to go
into kfree_skb(). We would still be calling skb_release_data(), and the
dataref would be decremented without the original freeing the page. We
have to wait until all the clones are freed and dataref reaches 0
before the head can be recycled.
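
To make the handoff concrete, here is a rough userspace sketch of the
sequence discussed above (clone, expand the original's head, then free
the clone). This is a toy model, not kernel code: every toy_* name is
hypothetical, dataref is a plain int rather than an atomic, and the
replacement head in toy_expand_head() is assumed to be ordinary
non-page_pool memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the shared head; NOT the kernel's skb_shared_info. */
struct toy_head {
	int dataref;   /* models shinfo->dataref (non-atomic here) */
	int recycled;  /* how many times the head went back to the pool */
};

/* Toy stand-in for struct sk_buff. */
struct toy_skb {
	struct toy_head *head;
	bool pp_recycle;
};

/* Cloning copies the head pointer and pp_recycle, and bumps dataref. */
static void toy_clone(struct toy_skb *dst, const struct toy_skb *src)
{
	*dst = *src;
	dst->head->dataref++;
}

/*
 * Models skb_release_data() with the patch applied: pp_recycle is
 * cleared on every exit, but only the holder that drops dataref to 0
 * recycles the head.
 */
static void toy_release_data(struct toy_skb *skb)
{
	struct toy_head *head = skb->head;
	bool recycle = skb->pp_recycle;

	skb->head = NULL;
	skb->pp_recycle = false;

	if (--head->dataref > 0)
		return;            /* another skb still holds the head */
	if (recycle)
		head->recycled++;  /* last holder returns it to the pool */
}

/* Models pskb_expand_head(): drop the old head, attach a fresh one. */
static void toy_expand_head(struct toy_skb *skb, struct toy_head *new_head)
{
	toy_release_data(skb);
	new_head->dataref = 1;
	new_head->recycled = 0;
	skb->head = new_head;  /* assumed to be non-page_pool memory */
}

/* Steps 1-3 from the discussion, followed by freeing the clone. */
static int toy_scenario(void)
{
	struct toy_head pool_head = { .dataref = 1, .recycled = 0 };
	struct toy_head heap_head;
	struct toy_skb orig = { .head = &pool_head, .pp_recycle = true };
	struct toy_skb clone;

	toy_clone(&clone, &orig);            /* dataref == 2 */
	toy_expand_head(&orig, &heap_head);  /* dataref == 1, no recycle */
	assert(pool_head.recycled == 0);
	assert(clone.pp_recycle);            /* clone still owns the flag */

	toy_release_data(&clone);            /* dataref == 0, recycle */
	return pool_head.recycled;
}
```

In this model toy_scenario() returns 1: the page_pool head is recycled
exactly once, by the clone, even though the original dropped its
reference first with pp_recycle cleared.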