Re: [PATCH v2 01/12] mm/debug_vm_pgtable: Introduce struct pgtable_debug_args

From: Gavin Shan
Date: Wed Jul 21 2021 - 08:10:01 EST


Hi Anshuman,

On 7/21/21 2:50 PM, Anshuman Khandual wrote:
On 7/21/21 4:59 AM, Gavin Shan wrote:
On 7/20/21 4:42 PM, Anshuman Khandual wrote:
On 7/19/21 6:31 PM, Gavin Shan wrote:
+    if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
+        has_transparent_hugepage()) {
+        page = alloc_pages(GFP_KERNEL, HPAGE_PMD_ORDER);
+        if (page) {
+            args->pmd_pfn = page_to_pfn(page);
+            args->pte_pfn = args->pmd_pfn;
+            return 0;
+        }
+    }
+

As syzbot reported against the v1 series, we could try to allocate more than (1 << (MAX_ORDER - 1)) pages here.
So __GFP_NOWARN is needed. I will fix it in the v3 series.

I found the following build warning reported by lkp on v2.

mm/debug_vm_pgtable.c:445:8: warning: variable 'pud' set but not used [-Wunused-but-set-variable]


Yes, the following line was missing from PATCH[v2 09/12] and is fixed in
PATCH[v3 09/12]: WARN_ON(!pud_none(pud)). With this line added,
the variable @pud is used in v3.

Could you please point to the syzbot-reported problem on v1 that you
mentioned above? Are there configs where HPAGE_[PMD|PUD]_ORDER
is greater than (MAX_ORDER - 1)? If yes, how does adding __GFP_NOWARN
solve the problem?


https://syzkaller.appspot.com/bug?extid=8730ec44a441a434a2c8
https://syzkaller.appspot.com/x/.config?x=29a82c885e192046

The kernel config has the following options:

CONFIG_X86_64=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
#define PUD_SHIFT        30
#define PMD_SHIFT        21

CONFIG_FORCE_MAX_ZONEORDER=n
#define MAX_ORDER    11

(HPAGE_PUD_SHIFT - PAGE_SHIFT) >= MAX_ORDER
(HPAGE_PMD_ORDER)              <  MAX_ORDER
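
As a rough illustration of the comparison above, the arithmetic can be written out in a small userspace sketch. The shift values and MAX_ORDER are the ones quoted for this config; PAGE_SHIFT = 12 (4 KiB base pages) is an assumption not stated in the thread, and both helper names are hypothetical.

```c
#include <assert.h>

/* Values quoted in the thread for the reported x86_64 config. */
#define PMD_SHIFT  21
#define PUD_SHIFT  30
#define MAX_ORDER  11
/* Assumption: 4 KiB base pages; PAGE_SHIFT is not quoted in the thread. */
#define PAGE_SHIFT 12

/* Hypothetical helper: page order of a huge page mapped at `shift` bits. */
int huge_page_order(int shift)
{
    assert(shift >= PAGE_SHIFT);
    return shift - PAGE_SHIFT;
}

/* Hypothetical helper: 1 if the order satisfies `order < MAX_ORDER`, else 0. */
int order_fits_buddy(int order)
{
    return order < MAX_ORDER;
}
```

Under these assumptions, huge_page_order(PUD_SHIFT) is 18, which fails the check, while huge_page_order(PMD_SHIFT) is 9, which passes it.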

The warning is triggered in the following path. __GFP_NOWARN suppresses
the WARN_ON_ONCE(), but NULL is still returned, as expected.

   alloc_pages
     __alloc_pages

       if (unlikely(order >= MAX_ORDER)) {
               WARN_ON_ONCE(!(gfp & __GFP_NOWARN));
               return NULL;
       }
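
To make the effect of the flag concrete, here is a minimal userspace model of the order check quoted above. It is only a sketch: the MODEL_* names, the flag bit values, and the result struct are hypothetical, the "once" behaviour of WARN_ON_ONCE() is not modelled, and an allocation that passes the check is simply assumed to succeed.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_ORDER   11
/* Placeholder bit values for illustration only, not the kernel's. */
#define MODEL_GFP_KERNEL  0x1u
#define MODEL_GFP_NOWARN  0x2u

struct model_result {
    bool allocated;  /* false models a NULL return */
    bool warned;     /* true models the warning firing */
};

/* Models only the order check; a passing order is assumed to succeed. */
struct model_result model_alloc_pages(unsigned int gfp, unsigned int order)
{
    struct model_result res = { false, false };

    if (order >= MODEL_MAX_ORDER) {
        res.warned = !(gfp & MODEL_GFP_NOWARN);
        return res;
    }
    res.allocated = true;
    return res;
}

/* Usage: order 18 fails either way; the flag only silences the warning. */
void model_alloc_pages_demo(void)
{
    struct model_result loud  = model_alloc_pages(MODEL_GFP_KERNEL, 18);
    struct model_result quiet = model_alloc_pages(MODEL_GFP_KERNEL | MODEL_GFP_NOWARN, 18);

    assert(!loud.allocated && loud.warned);
    assert(!quiet.allocated && !quiet.warned);
}
```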

But then that does not allocate the PUD element for the test, which
will subsequently be skipped, won't it? So if the order is MAX_ORDER
or greater, the allocation needs to happen via alloc_contig_pages()
or something similar.


Yes, the corresponding (modifying) tests will be skipped if we fail
to allocate the PUD huge page. And we need to use alloc_contig_pages()
when CONFIG_CONTIG_ALLOC is enabled. Otherwise, alloc_pages() is still
used as best-effort before we fail completely.
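
The decision described here might be modelled roughly as follows. This is a userspace sketch of the control flow only, not the code from the series: the enum, the function name, and the two boolean parameters (standing in for CONFIG_CONTIG_ALLOC and for whether a contiguous range can be found) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_ORDER 11

enum huge_alloc_outcome {
    HUGE_ALLOC_SKIP_TESTS,  /* no page; the modifying tests are skipped */
    HUGE_ALLOC_BUDDY,       /* models alloc_pages() succeeding */
    HUGE_ALLOC_CONTIG       /* models alloc_contig_pages() succeeding */
};

enum huge_alloc_outcome model_alloc_huge_page(unsigned int order,
                                              bool contig_alloc_enabled,
                                              bool contig_range_available)
{
    if (order >= MODEL_MAX_ORDER) {
        if (contig_alloc_enabled && contig_range_available)
            return HUGE_ALLOC_CONTIG;
        /* The best-effort buddy attempt is rejected by the order check. */
        return HUGE_ALLOC_SKIP_TESTS;
    }
    /* Assumption: an in-range buddy allocation finds free memory. */
    return HUGE_ALLOC_BUDDY;
}

/* Usage: order 18 models the PUD huge page, order 9 the PMD huge page. */
void model_alloc_huge_page_demo(void)
{
    assert(model_alloc_huge_page(18, true, true) == HUGE_ALLOC_CONTIG);
    assert(model_alloc_huge_page(18, false, true) == HUGE_ALLOC_SKIP_TESTS);
    assert(model_alloc_huge_page(9, false, false) == HUGE_ALLOC_BUDDY);
}
```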

This was explained when we discussed PATCH[v3 01/12].

Thanks,
Gavin