Re: [PATCH v2 2/2] mm: wrap __find_buddy_pfn() with a necessary buddy page validation.

From: Zi Yan
Date: Fri Apr 01 2022 - 14:56:15 EST


On 1 Apr 2022, at 14:29, Linus Torvalds wrote:

> On Fri, Apr 1, 2022 at 11:11 AM Zi Yan <zi.yan@xxxxxxxx> wrote:
>>
>> Whenever the buddy of a page is found from __find_buddy_pfn(),
>> page_is_buddy() should be used to check its validity. Add a helper
>> function find_buddy_page_pfn() to find the buddy page and do the check
>> together.
>
> Well, this certainly looks nicer, except now:
>
>> +extern struct page *find_buddy_page_pfn(struct page *page, unsigned int order);
>
> that 'pfn' no longer makes sense in the name, and
>
>> @@ -1075,11 +1118,11 @@ static inline void __free_one_page(struct page *page,
>> migratetype);
>> return;
>> }
>> - buddy_pfn = __find_buddy_pfn(pfn, order);
>> - buddy = page + (buddy_pfn - pfn);
>>
>> - if (!page_is_buddy(page, buddy, order))
>> + buddy = find_buddy_page_pfn(page, order);
>> + if (!buddy)
>> goto done_merging;
>> + buddy_pfn = page_to_pfn(buddy);
>
> This case now does two "page_to_pfn()" calls (one inside
> find_buddy_page_pfn(), and one explicitly on the buddy).
>
> And those page_to_pfn() things can actually be fairly expensive. It
> *looks* like just a subtraction, but it's a pointer subtraction that
> can end up generating a divide by a non-power-of-two size.
>
> NORMALLY we try very hard to make 'sizeof struct page' be exactly 8
> words, but I do not believe this is actually guaranteed.
>
> And yeah, the divide-by-a-constant can be turned into a multiply, but
> even that is not necessarily always cheap.
>
> Now, two out of three use-cases didn't actually want the buddy_pfn(),
> but this one use-case does look like it might be performance-critical
> and a problem.

Yeah, I should have listened to your initial suggestion. Thanks for the
explanation.

How about the patch below? If it looks good, I will send v3.

diff --git a/mm/internal.h b/mm/internal.h
index 876e66237c89..754a666e5785 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -211,28 +211,8 @@ struct alloc_context {
bool spread_dirty_pages;
};

-/*
- * Locate the struct page for both the matching buddy in our
- * pair (buddy1) and the combined O(n+1) page they form (page).
- *
- * 1) Any buddy B1 will have an order O twin B2 which satisfies
- * the following equation:
- * B2 = B1 ^ (1 << O)
- * For example, if the starting buddy (buddy2) is #8 its order
- * 1 buddy is #10:
- * B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
- *
- * 2) Any buddy B will have an order O+1 parent P which
- * satisfies the following equation:
- * P = B & ~(1 << O)
- *
- * Assumption: *_mem_map is contiguous at least up to MAX_ORDER
- */
-static inline unsigned long
-__find_buddy_pfn(unsigned long page_pfn, unsigned int order)
-{
- return page_pfn ^ (1 << order);
-}
+extern struct page *find_buddy_page_pfn(struct page *page, unsigned long pfn,
+ unsigned int order, unsigned long *buddy_pfn);

extern struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
unsigned long end_pfn, struct zone *zone);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2ea106146686..8195b4f64ec5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -998,6 +998,57 @@ static inline void del_page_from_free_list(struct page *page, struct zone *zone,
zone->free_area[order].nr_free--;
}

+/*
+ * Locate the struct page for both the matching buddy in our
+ * pair (buddy1) and the combined O(n+1) page they form (page).
+ *
+ * 1) Any buddy B1 will have an order O twin B2 which satisfies
+ * the following equation:
+ * B2 = B1 ^ (1 << O)
+ * For example, if the starting buddy (buddy2) is #8 its order
+ * 1 buddy is #10:
+ * B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
+ *
+ * 2) Any buddy B will have an order O+1 parent P which
+ * satisfies the following equation:
+ * P = B & ~(1 << O)
+ *
+ * Assumption: *_mem_map is contiguous at least up to MAX_ORDER
+ */
+static inline unsigned long
+__find_buddy_pfn(unsigned long page_pfn, unsigned int order)
+{
+ return page_pfn ^ (1 << order);
+}
+
+/*
+ * Find the buddy of @page and validate it.
+ * @page: The input page
+ * @pfn: The pfn of the page, it saves a call to page_to_pfn() when the
+ * function is used in the performance-critical __free_one_page().
+ * @order: The order of the page
+ * @buddy_pfn: The output pointer to the buddy pfn, it also saves a call to
+ * page_to_pfn().
+ *
+ * The found buddy can be a non PageBuddy, out of @page's zone, or its order is
+ * not the same as @page. The validation is necessary before use it.
+ *
+ * Return: the found buddy page or NULL if not found.
+ */
+struct page *find_buddy_page_pfn(struct page *page, unsigned long pfn,
+ unsigned int order, unsigned long *buddy_pfn)
+{
+ struct page *buddy;
+
+ *buddy_pfn = __find_buddy_pfn(pfn, order);
+ buddy = page + (*buddy_pfn - pfn);
+
+ if (page_is_buddy(page, buddy, order))
+ return buddy;
+ return NULL;
+}
+
/*
* If this is not the largest possible page, check if the buddy
* of the next-highest order is free. If it is, it's possible
@@ -1010,18 +1061,17 @@ static inline bool
buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn,
struct page *page, unsigned int order)
{
- struct page *higher_page, *higher_buddy;
- unsigned long combined_pfn;
+ struct page *higher_page;
+ unsigned long higher_page_pfn;

if (order >= MAX_ORDER - 2)
return false;

- combined_pfn = buddy_pfn & pfn;
- higher_page = page + (combined_pfn - pfn);
- buddy_pfn = __find_buddy_pfn(combined_pfn, order + 1);
- higher_buddy = higher_page + (buddy_pfn - combined_pfn);
+ higher_page_pfn = buddy_pfn & pfn;
+ higher_page = page + (higher_page_pfn - pfn);

- return page_is_buddy(higher_page, higher_buddy, order + 1);
+ return find_buddy_page_pfn(higher_page, higher_page_pfn, order + 1,
+ &buddy_pfn) != NULL;
}

/*
@@ -1075,10 +1125,9 @@ static inline void __free_one_page(struct page *page,
migratetype);
return;
}
- buddy_pfn = __find_buddy_pfn(pfn, order);
- buddy = page + (buddy_pfn - pfn);

- if (!page_is_buddy(page, buddy, order))
+ buddy = find_buddy_page_pfn(page, pfn, order, &buddy_pfn);
+ if (!buddy)
goto done_merging;

if (unlikely(order >= pageblock_order)) {
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index f67c4c70f17f..4fbd27ba91c6 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -70,7 +70,7 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
unsigned long flags, nr_pages;
bool isolated_page = false;
unsigned int order;
- unsigned long pfn, buddy_pfn;
+ unsigned long buddy_pfn;
struct page *buddy;

zone = page_zone(page);
@@ -89,11 +89,9 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
if (PageBuddy(page)) {
order = buddy_order(page);
if (order >= pageblock_order && order < MAX_ORDER - 1) {
- pfn = page_to_pfn(page);
- buddy_pfn = __find_buddy_pfn(pfn, order);
- buddy = page + (buddy_pfn - pfn);
-
- if (!is_migrate_isolate_page(buddy)) {
+ buddy = find_buddy_page_pfn(page, page_to_pfn(page),
+ order, &buddy_pfn);
+ if (buddy && !is_migrate_isolate_page(buddy)) {
isolated_page = !!__isolate_free_page(page, order);
/*
* Isolating a free page in an isolated pageblock

--
Best Regards,
Yan, Zi

Attachment: signature.asc
Description: OpenPGP digital signature