Re: [PATCH v2 1/3] mm/page_alloc: correct start page when guard page debug is enabled

From: Kemeng Shi
Date: Wed Aug 30 2023 - 02:28:41 EST




on 8/28/2023 11:21 PM, Naoya Horiguchi wrote:
> On Sat, Aug 26, 2023 at 11:47:43PM +0800, Kemeng Shi wrote:
>> When guard page debug is enabled and set_page_guard returns success, we
>> miss to forward page to point to start of next split range and we will do
>> split unexpectedly in page range without target page. Move start page
>> update before set_page_guard to fix this.
>>
>> As we split to wrong target page, then splited pages are not able to merge
>> back to original order when target page is put back and splited pages
>> except target page is not usable. To be specific:
>>
>> Consider target page is the third page in buddy page with order 2.
>> | buddy-2 | Page | Target | Page |
>>
>> After break down to target page, we will only set first page to Guard
>> because of bug.
>> | Guard | Page | Target | Page |
>>
>> When we try put_page_back_buddy with target page, the buddy page of target
>> if neither guard nor buddy, Then it's not able to construct original page
>> with order 2
>> | Guard | Page | buddy-0 | Page |
>>
>> All pages except target page is not in free list and is not usable.
>>
>> Fixes: 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages")
>> Signed-off-by: Kemeng Shi <shikemeng@xxxxxxxxxxxxxxx>
>
> Thank you for finding the problem and writing patches. I think the patch
> fixes the reported problem, But I wonder that we really need guard page
> mechanism in break_down_buddy_pages() which is only called from memory_failure.
> As stated in Documentation/admin-guide/kernel-parameters.txt, this is a
> debugging feature to detect memory corruption due to buggy kernel or drivers
> code. So if HW memory failrue seems to be out of the scope, and I feel that
> we could simply remove it from break_down_buddy_pages().
>
> debug_guardpage_minorder=
> [KNL] When CONFIG_DEBUG_PAGEALLOC is set, this
> parameter allows control of the order of pages that will
> be intentionally kept free (and hence protected) by the
> buddy allocator. Bigger value increase the probability
> of catching random memory corruption, but reduce the
> amount of memory for normal system use. The maximum
> possible value is MAX_ORDER/2. Setting this parameter
> to 1 or 2 should be enough to identify most random
> memory corruption problems caused by bugs in kernel or
> driver code when a CPU writes to (or reads from) a
> random memory location. Note that there exists a class
> of memory corruptions problems caused by buggy H/W or
> F/W or by drivers badly programming DMA (basically when
> memory is written at bus level and the CPU MMU is
> bypassed) which are not detectable by
> CONFIG_DEBUG_PAGEALLOC, hence this option will not help
> tracking down these problems.
>
> If you have any idea about how guard page mechanism helps memory_failrue,
> could you share it?
>
Hi Naoya, thanks for feedback. Commit c0a32fc5a2e47 ("mm: more intensive
memory corruption debugging") menthioned we konw that with
CONFIG_DEBUG_PAGEALLOC configured, the CPU will generate an exception on
access (read,write) to an unallocated page, which permits us to catch code
which corrupts memory; Guard page aims to keep more free/protected pages
and to interlace free/protected and allocated pages to increase the
probability of catching corruption. Keep guard page around failrue looks
helpful to catch random access. Wish this can help.

> Thanks,
> Naoya Horiguchi
>
>> ---
>> mm/page_alloc.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index fefc4074d9d0..88c5f5aea9b0 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6505,6 +6505,7 @@ static void break_down_buddy_pages(struct zone *zone, struct page *page,
>> next_page = page;
>> current_buddy = page + size;
>> }
>> + page = next_page;
>>
>> if (set_page_guard(zone, current_buddy, high, migratetype))
>> continue;
>> @@ -6512,7 +6513,6 @@ static void break_down_buddy_pages(struct zone *zone, struct page *page,
>> if (current_buddy != target) {
>> add_to_free_list(current_buddy, zone, high, migratetype);
>> set_buddy_order(current_buddy, high);
>> - page = next_page;
>> }
>> }
>> }
>> --
>> 2.30.0
>>
>>
>>
>