Re: [PATCH v4] Add /proc/PID/smaps support for DAX
From: Dave Hansen
Date: Thu Oct 26 2017 - 10:03:39 EST
I'm honestly not understanding what problem this solves. Could you,
perhaps, do a before and after of smaps with and without this patch?
> +/* page structure behind DAX mappings is NOT compound page
> + * when it's a huge page mappings, so introduce new API to
> + * account for both PMD and PUD mapping.
> + */
Why do they need to be compound? Why don't we just make them compound
instead of adding all this code which is *just* for DAX?
> +static void smaps_account_dax_huge(struct mem_size_stats *mss,
> + struct page *page, unsigned long size, bool young, bool dirty)
> +{
> + int mapcount = page_mapcount(page);
> +
> + if (PageAnon(page)) {
> + mss->anonymous += size;
> + if (!PageSwapBacked(page) && !dirty && !PageDirty(page))
> + mss->lazyfree += size;
> + }
How can you have DAX anonymous huge pages?
> + mss->resident += size;
> + /* Accumulate the size in pages that have been accessed. */
> + if (young || page_is_young(page) || PageReferenced(page))
> + mss->referenced += size;
Isn't this just a copy'n'paste of smaps_account() code?
> + /*
> + * page_count(page) == 1 guarantees the page is mapped exactly once.
> + * If any subpage of the compound page mapped with PTE it would elevate
> + * page_count().
> + */
> + if (page_count(page) == 1) {
> + if (dirty || PageDirty(page))
> + mss->private_dirty += size;
> + else
> + mss->private_clean += size;
> + mss->pss += (u64)size << PSS_SHIFT;
> + return;
> + }
PSS makes *zero* sense for DAX. The "memory" is used whether the
mapping exists or not.
Also, the idea of "private" doesn't really make sense here.
> + if (mapcount >= 2) {
> + if (dirty || PageDirty(page))
> + mss->shared_dirty += size;
> + else
> + mss->shared_clean += size;
> + mss->pss += (size << PSS_SHIFT) / mapcount;
> + } else {
> + if (dirty || PageDirty(page))
> + mss->private_dirty += size;
> + else
> + mss->private_clean += size;
> + mss->pss += size << PSS_SHIFT;
> + }
> +}
> +
> #ifdef CONFIG_SHMEM
> static int smaps_pte_hole(unsigned long addr, unsigned long end,
> struct mm_walk *walk)
> @@ -528,7 +577,16 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
> struct page *page = NULL;
>
> if (pte_present(*pte)) {
> - page = vm_normal_page(vma, addr, *pte);
> + if (!vma_is_dax(vma))
> + page = vm_normal_page(vma, addr, *pte);
> + else if (pte_devmap(*pte)) {
> + struct dev_pagemap *pgmap;
> +
> + pgmap = get_dev_pagemap(pte_pfn(*pte), NULL);
> + if (!pgmap)
> + return;
> + page = pte_page(*pte);
> + }
> } else if (is_swap_pte(*pte)) {
> swp_entry_t swpent = pte_to_swp_entry(*pte);
>
> @@ -579,7 +637,19 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
> struct page *page;
>
> /* FOLL_DUMP will return -EFAULT on huge zero page */
> - page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
> + if (!vma_is_dax(vma))
> + page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
> + else if (pmd_devmap(*pmd)) {
> + struct dev_pagemap *pgmap;
> +
> + pgmap = get_dev_pagemap(pmd_pfn(*pmd), NULL);
> + if (!pgmap)
> + return;
> + page = pmd_page(*pmd);
> + smaps_account_dax_huge(mss, page, PMD_SIZE, pmd_young(*pmd),
> + pmd_dirty(*pmd));
> + return;
> + }
> if (IS_ERR_OR_NULL(page))
> return;
> if (PageAnon(page))
>
There's a fair amount of copying and pasting going on here. There is,
again, a bunch of specialized DAX code. Isn't there a way to do this
more generically?