Re: [PATCH v2] page_alloc: Fix freeing non-compound pages
From: Matthew Wilcox
Date: Mon Sep 28 2020 - 21:17:25 EST
On Mon, Sep 28, 2020 at 06:03:07PM -0700, Andrew Morton wrote:
> On Sat, 26 Sep 2020 22:39:19 +0100 "Matthew Wilcox (Oracle)" <willy@xxxxxxxxxxxxx> wrote:
>
> > Here is a very rare race which leaks memory:
>
> Not worth a cc:stable?
Yes, it probably should have been. I just assume the stablebot will
pick up anything that has a Fixes: tag.
> > Page P0 is allocated to the page cache. Page P1 is free.
> >
> > Thread A                Thread B                Thread C
> > find_get_entry():
> > xas_load() returns P0
> >                                                 Removes P0 from page cache
> >                                                 P0 finds its buddy P1
> >                         alloc_pages(GFP_KERNEL, 1) returns P0
> >                         P0 has refcount 1
> > page_cache_get_speculative(P0)
> > P0 has refcount 2
> >                         __free_pages(P0)
>
> __free_pages(P0, 1), I assume.
Good catch. That was what I meant to type.
> > P0 has refcount 1
> > put_page(P0)
>
> but this is implicitly order 0
Right, because it's not a compound page.
> > P1 is not freed
>
> huh.
Yeah. Nasty, and we'll never know how often it was hit.
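For anyone who wants to poke at the sequence, here is a minimal userspace sketch of it. It is a toy model, not kernel code: struct model, model_free(), model_free_pages(), model_put_page() and run_race() are all hypothetical names, the allocation is assumed to be order 1 and non-compound (so the PageHead() check is left out), and the interleaving is replayed sequentially rather than with real threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical two-page model of an order-1, non-compound allocation. */
struct model {
	int refcount;	/* refcount of P0 */
	bool freed[2];	/* freed[0] is P0, freed[1] is P1 */
};

/* Stand-in for free_the_page(): mark 2^order pages starting at 'first'. */
static void model_free(struct model *m, unsigned int first, unsigned int order)
{
	unsigned int i;

	assert(first + (1u << order) <= 2);
	for (i = 0; i < (1u << order); i++)
		m->freed[first + i] = true;
}

/* Models __free_pages(); 'patched' selects the new else-branch. */
static void model_free_pages(struct model *m, unsigned int order, bool patched)
{
	if (--m->refcount == 0)
		model_free(m, 0, order);
	else if (patched)
		while (order-- > 0)
			model_free(m, 1u << order, order);
}

/* Models put_page() on a non-compound page: implicitly order 0. */
static void model_put_page(struct model *m)
{
	if (--m->refcount == 0)
		model_free(m, 0, 0);
}

/* Replays the interleaving from the commit message in program order. */
static struct model run_race(bool patched)
{
	struct model m = { .refcount = 1 };	/* Thread B: alloc_pages(GFP_KERNEL, 1) */

	m.refcount++;				/* Thread A: speculative get */
	model_free_pages(&m, 1, patched);	/* Thread B: __free_pages(P0, 1) */
	model_put_page(&m);			/* Thread A: put_page(P0) */
	return m;
}
```

In this model, run_race(false) ends with P0 freed and P1 never freed, and run_race(true) ends with both freed.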
> > Fix this by freeing all the pages in __free_pages() that won't be freed
> > by the call to put_page(). It's usually not a good idea to split a page,
> > but this is a very unlikely scenario.
> >
> > ...
> >
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -4947,6 +4947,9 @@ void __free_pages(struct page *page, unsigned int order)
> >  {
> >  	if (put_page_testzero(page))
> >  		free_the_page(page, order);
> > +	else if (!PageHead(page))
> > +		while (order-- > 0)
> > +			free_the_page(page + (1 << order), order);
>
> Well that's weird and scary looking. `page' has non-zero refcount yet
> we go and free random followon pages. Methinks it merits an
> explanatory comment?
Well, poot. I lost that comment in the shuffling of patches. In a
different tree, I have:
@@ -4943,10 +4943,19 @@ static inline void free_the_page(struct page *page, unsigned int order)
 		__free_pages_ok(page, order);
 }

+/*
+ * If we free a non-compound allocation, another thread may have a
+ * speculative reference to the first page. It has no way of knowing
+ * about the rest of the allocation, so we have to free all but the
+ * first page here.
+ */
 void __free_pages(struct page *page, unsigned int order)
 {
 	if (put_page_testzero(page))
 		free_the_page(page, order);
+	else if (!PageHead(page))
+		while (order-- > 0)
+			free_the_page(page + (1 << order), order);
 }
 EXPORT_SYMBOL(__free_pages);
Although I'm now thinking of making that comment into kernel-doc and
turning it into advice to the caller rather than an internal note to
other mm developers.
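To see which pages the new loop hands back, here is a small userspace sketch of the same loop shape. It is an illustration under assumptions, not the kernel implementation: pages are modelled as indices 0..31, and mark_freed() and tail_pages_freed() are hypothetical helpers that record freed pages as bits in a mask.

```c
#include <assert.h>

/*
 * Hypothetical userspace model, not kernel code: pages are indices 0..31
 * and "freeing" a block just sets the corresponding bits in a mask.
 */
static unsigned int mark_freed(unsigned int mask, unsigned int first,
			       unsigned int order)
{
	unsigned int i;

	assert(first + (1u << order) <= 32);
	for (i = 0; i < (1u << order); i++)
		mask |= 1u << (first + i);
	return mask;
}

/* Same loop shape as the new else-branch, with page index 0 as the base. */
static unsigned int tail_pages_freed(unsigned int order)
{
	unsigned int mask = 0;

	while (order-- > 0)
		mask = mark_freed(mask, 1u << order, order);
	return mask;
}
```

For order 3 the loop frees indices 4..7 as an order-2 block, 2..3 as an order-1 block and 1 as an order-0 block, so tail_pages_freed(3) is 0xfe: every page except the first, which is left for whoever holds the remaining reference.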