Re: [PATCH 1/2] mm/debug_vm_pgtable/basic: Add validation for dirtiness after write protect

From: Catalin Marinas
Date: Mon Nov 30 2020 - 06:02:51 EST


On Mon, Nov 30, 2020 at 04:28:20PM +0530, Anshuman Khandual wrote:
> On 11/30/20 3:08 PM, Catalin Marinas wrote:
> > On Mon, Nov 30, 2020 at 09:55:00AM +0530, Anshuman Khandual wrote:
> >> On 11/27/20 3:14 PM, Catalin Marinas wrote:
> >>> On Fri, Nov 27, 2020 at 09:22:24AM +0100, Christophe Leroy wrote:
> >>>> Le 27/11/2020 à 06:06, Anshuman Khandual a écrit :
> >>>>> This adds validation tests for dirtiness after write protect conversion for
> >>>>> each page table level. This is important for platforms such as arm64 that
> >>>>> removes the hardware dirty bit while making it an write protected one. This
> >>>>> also fixes pxx_wrprotect() related typos in the documentation file.
> >>>>
> >>>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> >>>>> index c05d9dcf7891..a5be11210597 100644
> >>>>> --- a/mm/debug_vm_pgtable.c
> >>>>> +++ b/mm/debug_vm_pgtable.c
> >>>>> @@ -70,6 +70,7 @@ static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
> >>>>> WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
> >>>>> WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
> >>>>> WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
> >>>>> + WARN_ON(pte_dirty(pte_wrprotect(pte)));
> >>>>
> >>>> Wondering what you are testing here exactly.
> >>>>
> >>>> Do you expect that if PTE has the dirty bit, it gets cleared by
> >>>> pte_wrprotect() ?
> >>>>
> >>>> Powerpc doesn't do that, it only clears the RW bit but the dirty
> >>>> bit remains if it is set, until you call pte_mkclean() explicitely.
> >>>
> >>> Arm64 has an unusual way of setting a hardware dirty "bit", it actually
> >>> clears the PTE_RDONLY bit. The pte_wrprotect() sets the PTE_RDONLY bit
> >>> back and we can lose the dirty information. Will found this and posted
> >>> patches to fix the arm64 pte_wprotect() to set a software PTE_DIRTY if
> >>> !PTE_RDONLY (we do this for ptep_set_wrprotect() already). My concern
> >>> was that we may inadvertently make a fresh/clean pte dirty with such
> >>> change, hence the suggestion for the test.
> >>>
> >>> That said, I think we also need a test in the other direction,
> >>> pte_wrprotect() should preserve any dirty information:
> >>>
> >>> WARN_ON(!pte_dirty(pte_wrprotect(pte_mkdirty(pte))));
> >>
> >> This seems like a generic enough principle which all platforms should
> >> adhere to. But the proposed test WARN_ON(pte_dirty(pte_wrprotect(pte)))
> >> might fail on some platforms if the page table entry came in as a dirty
> >> one and pte_wrprotect() is not expected to alter the dirty state.
> >
> > Ah, so do we have architectures where entries in protection_map[] are
> > already dirty? If those are valid, maybe the check should be:
>
> Okay, I did not imply that actually. The current position for these new
> tests in respective pxx_basic_tests() functions is right at the end and
> hence the pxx might have already gone through some changes from the time
> it was originally created with pfn_pxx(). The entry here is not starting
> from the beginning. It is not expected as well, per design. So dirty bit
> might or might not be there depending on all the previous test sequences
> leading upto these new ones.
>
> IIUC, Christophe mentioned the fact that on platforms like powerpc, dirty
> bit just remains unchanged during pte_wprotect(). So the current test
> WARN_ON(pte_dirty(pte_wrprotect(pte))) will not work on powerpc if the
> previous tests leading upto that point has got the dirty bit set. This is
> irrespective of how it was created with pfn_pte() from protection_map[]
> originally at the beginning.
[...]
> To achieve this, we could move the test right at the beginning just after
> the pxx gets created from protection_map[], with a comment explaining the
> rationale.

OK, this makes sense. Thanks for the clarification.

--
Catalin