Re: [RFC PATCH 4/4] x86/mm: write protect (most) page tables

From: David Hildenbrand
Date: Wed Aug 25 2021 - 04:38:29 EST


On 24.08.21 01:50, Dave Hansen wrote:
On 8/23/21 6:25 AM, Mike Rapoport wrote:
void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
{
+ enable_pgtable_write(page_address(pte));
pgtable_pte_page_dtor(pte);
paravirt_release_pte(page_to_pfn(pte));
paravirt_tlb_remove_table(tlb, pte);
@@ -69,6 +73,7 @@ void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
#ifdef CONFIG_X86_PAE
tlb->need_flush_all = 1;
#endif
+ enable_pgtable_write(pmd);
pgtable_pmd_page_dtor(page);
paravirt_tlb_remove_table(tlb, page);
}

I would expected this to have leveraged the pte_offset_map/unmap() code
to enable/disable write access. Granted, it would enable write access
even when only a read is needed, but that could be trivially fixed with
having a variant like:

For write access you actually want pte_offset_map_locked(), but it's also used for stable read access sometimes (exclude any writers).


pte_offset_map_write()
pte_offset_unmap_write()

in addition to the existing (presumably read-only) versions:

pte_offset_map()
pte_offset_unmap()

These should mostly only be read access, you'd need other ways of making sure nobody else messes with that entry. I think it even holds for khugepaged collapsing code.


I find these hidden PMD entry modifications (e.g., without holding the PMD lock) deep down in arch code quite concerning. Read: horribly ugly and a nightmare to debug.

--
Thanks,

David / dhildenb