Re: WARN_ON in move_normal_pmd

From: Joel Fernandes
Date: Fri Mar 24 2023 - 09:43:41 EST


On Fri, Mar 24, 2023 at 9:05 AM Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote:
>
> On Fri, Mar 24, 2023 at 12:15:24PM +0100, Michal Hocko wrote:
> > Hi,
> > our QA is regularly hitting
> > [ 544.198822][T20518] WARNING: CPU: 1 PID: 20518 at ../mm/mremap.c:255 move_pgt_entry+0x4c6/0x510
> > triggered by thp01 LTP test. This has been brought up in the past and
> > resulted in f81fdd0c4ab7 ("mm: document warning in move_normal_pmd() and
> > make it warn only once"). While it is good that the underlying problem
> > is understood, it doesn't seem there is enough interest to fix it
> > properly. Which is fair but I am wondering whether the WARN_ON gives
> > us anything here.
> >
> > Our QA process collects all unexpected side effects of tests and a WARN*
> > in the log is certainly one of those. This trigger bugs which are mostly
> > ignored as there is no upstream fix for them. This alone is nothing
> > really critical but there are workloads which do run with panic on warn
> > configured and this issue would put the system down which is unnecessary
> > IMHO. Would it be sufficient to replace the WARN_ON_ONCE by
> > pr_warn_once?
>
> What about relaxing the check to exclude temporary stack from the WARN
> condition:
>
> diff --git a/mm/mremap.c b/mm/mremap.c
> index 411a85682b58..eb0778b9d645 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -247,15 +247,12 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
> * of any 4kB pages, but still there) PMD in the page table
> * tree.
> *
> - * Warn on it once - because we really should try to figure
> - * out how to do this better - but then say "I won't move
> - * this pmd".
> - *
> - * One alternative might be to just unmap the target pmd at
> - * this point, and verify that it really is empty. We'll see.
> + * Warn on it once unless it is initial (temporary) stack.
> */
> - if (WARN_ON_ONCE(!pmd_none(*new_pmd)))
> + if (!pmd_none(*new_pmd)) {
> + WARN_ON_ONCE(!vma_is_temporary_stack(vma));
> return false;
> + }

Wouldn't it be better to instead fix it from the caller side? Like
making it non-overlapping.

Reading some old threads, I had tried to fix it [1] along these lines
but Linus was rightfully concerned about that fix [2]. Maybe we can
revisit and fix it properly this time.

Personally I feel the safest thing to do is to not do a
non-overlapping mremap and get rid of the warning. Or is there a
better way like unmapping the target from the caller side first,
before the move?

Michal, would you also mind providing the full stack you are seeing
just so we know it is the same issue (i.e. triggered via exec)?

thanks,

- Joel

[1] https://lore.kernel.org/lkml/20200712215041.GA3644504@xxxxxxxxxx/
[2] https://lore.kernel.org/lkml/CAHk-=whxP0Gj70pJN5R7Qec4qjrGr+G9Ex7FJi7=_fPcdQ2ocQ@xxxxxxxxxxxxxx/