Re: [PATCH 1/4] mm: thp: Return the correct value for change_huge_pmd

From: Linus Torvalds
Date: Sat Mar 07 2015 - 15:31:12 EST


On Sat, Mar 7, 2015 at 7:20 AM, Mel Gorman <mgorman@xxxxxxx> wrote:
>
> if (!prot_numa || !pmd_protnone(*pmd)) {
> - ret = 1;
> entry = pmdp_get_and_clear_notify(mm, addr, pmd);
> entry = pmd_modify(entry, newprot);
> ret = HPAGE_PMD_NR;

Hmm. I know I acked this already, but the return value - which correct
- is still potentially something we could improve upon.

In particular, we don't need to flush the TLB's if the old entry was
not present. Sadly, we don't have a helper function for that.

But the code *could* do something like

entry = pmdp_get_and_clear_notify(mm, addr, pmd);
ret = pmd_tlb_cacheable(entry) ? HPAGE_PMD_NR : 1;
entry = pmd_modify(entry, newprot);

where pmd_tlb_cacheable() on x86 would test if _PAGE_PRESENT (bit #0) is set.

In particular, that would mean that as we change *from* a protnone
(whether NUMA or really protnone) we wouldn't need to flush the TLB.

In fact, we could make it even more aggressive: it's not just an old
non-present TLB entry that doesn't need flushing - we can avoid the
flushing whenever we strictly increase the access rigths. So we could
have something that takes the old entry _and_ the new protections into
account, and avoids the TLB flush if the new entry is strictly more
permissive.

This doesn't explain the extra TLB flushes Dave sees, though, because
the old code didn't make those kinds of optimizations either. But
maybe something like this is worth doing.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/