Re: [RFC PATCH] net: Fix one page_pool page leak from skb_frag_unref

From: Mina Almasry
Date: Sat Apr 27 2024 - 00:24:34 EST


On Fri, Apr 26, 2024 at 4:09 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> On Thu, 25 Apr 2024 12:20:59 -0700 Mina Almasry wrote:
> > - if (recycle && napi_pp_get_page(page))
> > + if (napi_pp_get_page(page))
>
> Pretty sure you can't do that. The "recycle" here is a concurrency
> guarantee. A guarantee someone is holding a pp ref on that page,
> a ref which will not go away while napi_pp_get_page() is executing.

I don't mean to argue, but I think the get_page()/put_page() pair we
do in the page ref path is susceptible to the same issue. AFAIU it's
not safe to get_page() if another CPU can be dropping the last ref;
get_page_unless_zero() should be used instead.
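
To make the concern concrete, here is a rough userspace sketch of the
two patterns using C11 atomics. The fake_* names and struct fake_page
are made up for illustration; this is only a model of the refcount
semantics as I understand them, not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted page; not the kernel's struct page. */
struct fake_page {
	atomic_int refcount;
};

/*
 * Models get_page(): an unconditional increment. It is only safe when the
 * caller knows some reference is held for the duration of the call,
 * otherwise it can resurrect a page whose last ref was just dropped.
 */
static void fake_get_page(struct fake_page *p)
{
	atomic_fetch_add(&p->refcount, 1);
}

/*
 * Models get_page_unless_zero(): take a reference only if the count has
 * not already reached zero. Returns false if the page is already dead.
 */
static bool fake_get_page_unless_zero(struct fake_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Models put_page(): returns true when the caller dropped the last ref. */
static bool fake_put_page(struct fake_page *p)
{
	return atomic_fetch_sub(&p->refcount, 1) == 1;
}

/* Single-threaded walk through the interesting states. */
static void fake_page_selfcheck(void)
{
	struct fake_page p;

	atomic_init(&p.refcount, 1);
	assert(fake_get_page_unless_zero(&p));  /* 1 -> 2 */
	assert(!fake_put_page(&p));             /* 2 -> 1 */
	assert(fake_put_page(&p));              /* 1 -> 0, last ref gone */
	assert(!fake_get_page_unless_zero(&p)); /* refuses a dead page */
	fake_get_page(&p);                      /* blindly goes 0 -> 1 */
}
```

The last call in fake_page_selfcheck() is the case I am worried about:
the unconditional variant happily takes a ref on a page whose count
already hit zero, while the unless_zero variant refuses.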

Since get_page() is accepted in the page ref path without such a guarantee,
it's not obvious to me why we need this guarantee in the pp ref path,
but I could be missing some subtlety. At any rate, if you prefer us
going down the road of reverting commit 2cc3aeb5eccc ("skbuff: Fix a
potential race while recycling page_pool packets"), I think that could
also fix the issue.

--
Thanks,
Mina