Re: [PATCH v3 1/2] mm: Abstract THP allocation

From: Dev Jain
Date: Wed Sep 11 2024 - 09:16:36 EST



On 9/11/24 18:44, David Hildenbrand wrote:
On 11.09.24 15:05, Dev Jain wrote:

On 9/11/24 18:30, David Hildenbrand wrote:
On 11.09.24 14:53, Dev Jain wrote:

On 9/11/24 14:57, David Hildenbrand wrote:
On 11.09.24 08:55, Dev Jain wrote:
In preparation for the second patch, abstract away the THP allocation
logic present in the create_huge_pmd() path, which corresponds to the
faulting case when no page is present.

There should be no functional change as a result of applying
this patch.


[...]

+
+static void map_pmd_thp(struct folio *folio, struct vm_fault *vmf,
+            struct vm_area_struct *vma, unsigned long haddr)
+{
+    pmd_t entry;
+
+    entry = mk_huge_pmd(&folio->page, vma->vm_page_prot);
+    entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
+    folio_add_new_anon_rmap(folio, vma, haddr, RMAP_EXCLUSIVE);
+    folio_add_lru_vma(folio, vma);
+    set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
+    update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);

It's quite weird to see a mixture of haddr and vmf->address, and
likely this mixture is wrong or not required.

Looking at arc's update_mmu_cache_pmd() implementation, I cannot see
how passing in the unaligned address would do the right thing. But
maybe arc also doesn't trigger that code path ... who knows :)
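
(For reference, and only as a sketch based on how the THP fault path
usually derives it, haddr is assumed here to be the PMD-aligned variant
of the fault address:

     unsigned long haddr = vmf->address & HPAGE_PMD_MASK;  /* PMD-aligned */

     set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);        /* aligned    */
     update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);     /* unaligned  */

so the hunk installs the mapping with the aligned address but updates
the MMU cache with the possibly-unaligned one.)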

If I am reading correctly, arch/arc/mm/tlb.c: update_mmu_cache_pmd()
calls update_mmu_cache_range(), which already expects an unaligned
address? But...

So update_mmu_cache_pmd() calls

     update_mmu_cache_range(NULL, vma, addr, &pte, HPAGE_PMD_NR);

But update_mmu_cache_range() only aligns it *to a page boundary*:

     unsigned long vaddr = vaddr_unaligned & PAGE_MASK;

Ah, totally missed that it was PAGE_MASK. Thanks.

We obtain the correct hugepage-aligned physical address from the PTE

     phys_addr_t paddr = pte_val(*ptep) & PAGE_MASK_PHYS;

Then, we look at the offset in our folio

     unsigned long offset = offset_in_folio(folio, paddr);

And adjust both vaddr and paddr

     paddr -= offset;
     vaddr -= offset;

To then use that combination with

     __inv_icache_pages(paddr, vaddr, nr);

If I am not wrong, getting a non-hugepage aligned vaddr messes up
things here. But only regarding the icache I think.
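
To make the effect concrete, here is a small userspace model of that
arithmetic (a sketch only: the PAGE_MASK/HPAGE_PMD_MASK values and the
addresses are made up, and offset_in_folio() is modelled as 0 because
the PMD entry's pfn points at the folio head):

     #include <stdio.h>

     #define PAGE_MASK       (~((1UL << 12) - 1))   /* 4 KiB pages         */
     #define HPAGE_PMD_MASK  (~((1UL << 21) - 1))   /* 2 MiB PMD huge page */

     /* Mirror the lines quoted above from update_mmu_cache_range(). */
     static void model(unsigned long vaddr_unaligned, unsigned long folio_paddr)
     {
         unsigned long vaddr  = vaddr_unaligned & PAGE_MASK;  /* PAGE_MASK, not HPAGE_PMD_MASK   */
         unsigned long paddr  = folio_paddr;                  /* pte_val(*ptep) & PAGE_MASK_PHYS */
         unsigned long offset = paddr - folio_paddr;          /* offset_in_folio() == 0 here     */

         paddr -= offset;
         vaddr -= offset;
         printf("icache range starts at paddr=%#lx vaddr=%#lx\n", paddr, vaddr);
     }

     int main(void)
     {
         unsigned long addr  = 0x7f0000201000UL;        /* hypothetical vmf->address       */
         unsigned long haddr = addr & HPAGE_PMD_MASK;   /* PMD-aligned variant             */
         unsigned long paddr = 0x80000000UL;            /* hypothetical folio start (phys) */

         model(haddr, paddr);   /* vaddr lines up with the folio start   */
         model(addr,  paddr);   /* vaddr is 4 KiB past it, paddr is not  */
         return 0;
     }

With the unaligned address, the (paddr, vaddr) pair handed to
__inv_icache_pages() no longer describes the same hugepage-aligned range.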

Looks like it...

As we are adding a fresh page where there previously wasn't anything
mapped (no icache invalidation required?), and because most anon
mappings are not executable, maybe that's why nobody has noticed so far.

Thanks for the observation!