Re: WARNING: at mm/mremap.c:211 move_page_tables in i386

From: Joel Fernandes
Date: Sun Jul 12 2020 - 17:50:46 EST


On Thu, Jul 09, 2020 at 10:22:21PM -0700, Linus Torvalds wrote:
> On Thu, Jul 9, 2020 at 9:29 PM Naresh Kamboju <naresh.kamboju@xxxxxxxxxx> wrote:
> >
> > Your patch applied and re-tested.
> > warning triggered 10 times.
> >
> > old: bfe00000-c0000000 new: bfa00000 (val: 7d530067)
>
> Hmm.. It's not even the overlapping case, it's literally just "move
> exactly 2MB of page tables exactly one pmd down". Which should be the
> nice efficient case where we can do it without modifying the lower
> page tables at all, we just move the PMD entry.

Hi Linus,

I reproduced Naresh's issue on a 32-bit x86 machine and the below patch fixes it.
The issue is solely within execve() itself and the way it allocates/copies the
temporary stack.

It is indeed an overlapping case: the temporary stack is large enough
that its old and new locations overlap. The VMA grows quite a bit because of
all the page faults that happen while the args/env are copied. Then, while
moving the overlapping region, move_page_tables() finds that a PMD is
already allocated.
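
To illustrate the overlap condition in isolation, here is a rough userspace
sketch. It is not kernel code: the helper names and the example addresses
mentioned afterwards are made up for illustration, and it only assumes a
stack that is moved downwards by stack_shift bytes, with the shift clamped
to the VMA size as in the patch further down.

```c
/*
 * Hypothetical userspace model, not kernel code.
 * A range [start, end) is moved down by 'shift' bytes,
 * so the destination is [start - shift, end - shift).
 */

/* Returns 1 if the destination range overlaps the source range. */
static int shifted_range_overlaps(unsigned long start, unsigned long end,
				  unsigned long shift)
{
	unsigned long new_end = end - shift;

	/* Destination ends above the source start: they share addresses. */
	return new_end > start;
}

/* Models the clamp: shift by at least the size of the range. */
static unsigned long clamp_shift(unsigned long start, unsigned long end,
				 unsigned long shift)
{
	unsigned long len = end - start;

	return shift < len ? len : shift;
}
```

For a made-up 4MB temporary stack at 0xbfc00000-0xc0000000 and a 2MB shift,
shifted_range_overlaps() reports an overlap. clamp_shift() raises the shift
to 4MB, after which source and destination no longer share any address.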

With the patch below applied, the warning has not triggered in 30 minutes of
testing so far.

Naresh, could you also test the below patch on your setup?

thanks,

- Joel

---8<-----------------------

From: Joel Fernandes <joelaf@xxxxxxxxxx>
Subject: [PATCH] fs/exec: Fix stack overlap issue during stack moving in i386

When running LTP's thp01 test, it is observed that a warning fires in
move_page_tables() because a PMD is already allocated.

This happens because the temporary stack overlaps, in the address space,
with the range it is being moved to when move_page_tables() is called.
During the move_page_tables() loop, the lookup for the destination picks
up the same valid PMD that was already allocated for the temporary
stack. This loop requires the PMD to be new or it warns. Making sure
the new location of the stack does not overlap the old location
makes the warning go away.

Fixes: b6a2fea39318e ("mm: variable length argument support")
Reported-by: Naresh Kamboju <naresh.kamboju@xxxxxxxxxx>
Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
---
fs/exec.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/fs/exec.c b/fs/exec.c
index e6e8a9a703278..a270205228a1a 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -755,6 +755,10 @@ int setup_arg_pages(struct linux_binprm *bprm,

 	stack_shift = vma->vm_end - stack_top;

+	/* Ensure the temporary stack is shifted by at least its size */
+	if (stack_shift < (vma->vm_end - vma->vm_start))
+		stack_shift = (vma->vm_end - vma->vm_start);
+
 	bprm->p -= stack_shift;
 	mm->arg_start = bprm->p;
 #endif
--
2.27.0.383.g050319c2ae-goog