Re: [PATCH v4 3/9] mm: pagewalk: Don't split transhuge pmds when a pmd_entry is present

From: Thomas HellstrÃm (VMware)
Date: Thu Oct 10 2019 - 02:15:42 EST

On 10/10/19 4:07 AM, Linus Torvalds wrote:
On Wed, Oct 9, 2019 at 6:10 PM Thomas HellstrÃm (VMware)
<thomas_os@xxxxxxxxxxxx> wrote:
Your original patch does exactly the same!
Oh, no. You misread my original patch.

Look again.

The logic in my original patch was very different. It said that

- *if* we have a pmd_entry function, then we obviously call that one.

And if - after calling the pmd_entry function - we are still a
hugepage, then we skip the pte_entry case entirely.

And part of skipping is obviously "don't split" - but it never had
that "don't split and then call the pte walker" case.

- and if we *don't* have a pmd_entry function, but we do have a
pte_entry function, then we always split before calling it.

Notice the difference?

From what I can tell, my patch is doing the same. At least that always was the intention. To determine whether to skip pte and skip split, your patch uses

/* No pte level at all? */
if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd)
|| pmd_devmap(*pmd))

whereas my patch does

ÂÂÂ ÂÂÂ ÂÂÂ if (pmd_trans_unstable(pmd))
goto again;
/* Fall through */

which is the same (pmd_trans_unstable() is the same test as you do, but not racy). Yes, it's missing the test for pmd_devmap() but I think that's an mm bug been discussed elsewhere, and we also rerun because a huge / none pmd at this (FALLBACK) point is probably a race and unintended.

But I think the "change pmd_entry to have a sane return code" is a
simpler and more flexible model, and then the pmd_entry code can just
let the pte walker split the pmd if needed.

OK, let's aim for that then.



So I liked that part of your patch.