Re: [PATCH 00/12] mm/debug_vm_pgtable: Enhancements

From: Anshuman Khandual
Date: Mon Jul 12 2021 - 00:14:03 EST


Hi Gavin,

Though I have not jumped into the details for all individual
patches here but still have some high level questions below.

On 7/6/21 11:47 AM, Gavin Shan wrote:
> There are couple of issues with current implementations and this series
> tries to resolve the issues:
>
> (a) All needed information are scattered in variables, passed to various
> test functions. The code is organized in pretty much relaxed fashion.
All these variables are first prepared in debug_vm_pgtable(), before
getting passed into respective individual test functions. Also these
test functions receive only the required number of variables not all.
Adding a structure that captures all test parameters at once before
passing them down will be unnecessary. I am still wondering what will
be the real benefit of this large code churn ?

>
> (b) The page isn't allocated from buddy during page table entry modifying
> tests. The page can be invalid, conflicting to the implementations
> of set_{pud, pmd, pte}_at() on ARM64. The target page is accessed
> so that the iCache can be flushed when execution permission is given
> on ARM64. Besides, the target page can be unmapped and access to
> it causes kernel crash.

Using 'start_kernel' based method for struct page usage, enabled this
test to run on platforms which might not have enough memory required
for various individual test functions. This method is not a problem for
tests that just need an aligned pfn (which creates a page table entry)
not a real struct page.

But not allocating and owning the struct page might be problematic for
tests that expect a real struct page and transform its state via set_
{pud, pmd, pte}_at() functions as reported here.

>
> "struct vm_pgtable_debug" is introduced to address issue (a). For issue
> (b), the used page is allocated from buddy in page table entry modifying
> tests. The corresponding tets will be skipped if we fail to allocate the
> (huge) page. For other test cases, the original page around to kernel
> symbol (@start_kernel) is still used.

For all basic pfn requiring tests, existing 'start_kernel' based method
should continue but allocate a struct page for other tests which change
the passed struct page. Skipping the tests when allocation fails is the
right thing to do.

- Anshuman