Re: [PATCH v3 1/6] mm/mremap: Optimize the start addresses in move_page_tables()

From: Linus Torvalds
Date: Wed May 24 2023 - 19:23:33 EST

Hmm. I'm still quite unhappy about your can_align_down().

On Wed, May 24, 2023 at 8:32 AM Joel Fernandes (Google)
<joel@xxxxxxxxxxxxxxxxx> wrote:
> + /* If the masked address is within vma, we cannot align the address down. */
> + if (vma->vm_start <= addr_masked)
> + return false;

I don't think this test is right.

The test should not be "is the mapping still there at the point we
aligned down to".

No, the test should be whether there is any part of the mapping below
the point we're starting with:

	if (vma->vm_start < addr_to_align)
		return false;

because we can do the "expand the move down" *only* if it's the
beginning of the vma (because otherwise we'd be moving part of the vma
that precedes the address!)

(Alternatively, just make that "<" be "!=" - we're basically saying
that we can expand moving ptes to a pmd boundary *only* if this vma
starts at that point. No?).
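
To make the difference between the two tests concrete, here is a small
userspace sketch (not kernel code). The struct vma_model type, the
PMD_*_MODEL macros and both helper names are made up for illustration,
and the 2 MiB pmd size is an assumption:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for vm_area_struct: only the fields used here. */
struct vma_model {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* Assumed pmd size of 2 MiB, purely for illustration. */
#define PMD_SIZE_MODEL	(1UL << 21)
#define PMD_MASK_MODEL	(~(PMD_SIZE_MODEL - 1))

/* Test as written in the patch: refuse only if the vma covers the masked address. */
bool patch_test_allows(const struct vma_model *vma, unsigned long addr_to_align)
{
	unsigned long addr_masked = addr_to_align & PMD_MASK_MODEL;

	assert(vma != NULL);
	return !(vma->vm_start <= addr_masked);
}

/* Suggested test: refuse if any part of the vma lies below the starting address. */
bool suggested_test_allows(const struct vma_model *vma, unsigned long addr_to_align)
{
	assert(vma != NULL);
	return !(vma->vm_start < addr_to_align);
}
```

With a vma starting at 0x450000 and addr_to_align at 0x500000, the
masked address is 0x400000. The patch's test lets the alignment
through, while the suggested test refuses it, because 0x450000-0x500000
belongs to the vma but sits below the start of the move.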

> + cur = find_vma_prev(vma->vm_mm, vma->vm_start, &prev);
> + if (!cur || cur != vma || !prev)
> + return false;

I've mentioned this test before, and I still find it actively misleading.

First off, the "!cur || cur != vma" test is clearly redundant. We know
'vma' isn't NULL (we just dereferenced it!). So "cur != vma" already
includes the "!cur" test.

So that "!cur" part of the test simply *cannot* be sensible.

And the "!prev" test still makes no sense to me. You tried to explain
it to me earlier, and I clearly didn't get it. It seems actively
wrong. I still think "!prev" should return true.

You seemed to think that "!prev" couldn't actually happen and would
be a sign of some VM problem, but that doesn't make any sense to me.
Of course !prev can happen - if "vma" is the first vma in the VM and
there is no previous.

It may be *rare*, but I still don't understand why you'd make that
"there is no vma below us" mean "we cannot expand the move below us
because there's something there".

So I continue to think that this test should just be

	if (WARN_ON_ONCE(cur != vma))
		return false;

because if it ever returns something that *isn't* the same as vma,
then we do indeed have serious problems. But that WARN_ON_ONCE() shows
that that's a "cannot happen" thing, not some kind of "if this happens
then don't do it" test.

and then the *real* test for "can we align down" should just be

	return !prev || prev->vm_end <= addr_masked;

Because while I think your code _works_, it really doesn't seem to
make much sense as it stands in your patch. The tests are actively
misleading. No?
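
Putting the pieces together, the logic could be sketched in userspace
roughly as follows. This is a model under stated assumptions, not the
actual kernel function: struct vma_sketch and can_align_down_sketch()
are hypothetical names, the caller passes in 'prev' directly instead of
looking it up with find_vma_prev(), the pmd mask is a parameter, and
the "!=" variant of the start test is used:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for vm_area_struct: only the fields used here. */
struct vma_sketch {
	unsigned long vm_start;
	unsigned long vm_end;
};

/*
 * Model of the suggested logic. 'prev' is the vma immediately below
 * 'vma', or NULL if 'vma' is the lowest mapping. 'mask' plays the role
 * of the pmd mask.
 */
bool can_align_down_sketch(const struct vma_sketch *vma,
			   const struct vma_sketch *prev,
			   unsigned long addr_to_align,
			   unsigned long mask)
{
	unsigned long addr_masked = addr_to_align & mask;

	assert(vma != NULL);

	/* Only expand the move downward if it starts at the vma's start. */
	if (vma->vm_start != addr_to_align)
		return false;

	/* No vma below us, or it ends at or before the aligned address. */
	return !prev || prev->vm_end <= addr_masked;
}
```

Note that in this model a NULL 'prev' means "nothing below us", so
aligning down is allowed rather than refused.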