Re: mm, something wrong in page_lock_anon_vma_read()?
From: Andrea Arcangeli
Date: Thu Jul 20 2017 - 12:15:56 EST
On Thu, Jul 20, 2017 at 02:58:35PM +0200, Andrea Arcangeli wrote:
> but if zap_pte in a fremap fails to drop the anon page that was under
> memory migration/compaction the exact same thing will happen. Either
... except it is ok to clear a migration entry, it will be migration
that will free the new page after migration completes, zap_pte doesn't
need to wait. So this fix is good, but I was too optimistic about its
ability to explain the whole problem. It only can explain Rss cosmetic
errors, not a anon page left hanging around after its anon vma has
been freed.
About the theory this could be THP related, the Rss stats being off by
one as symptom of the bug, don't seem to point in that direction, all
complex THP operations don't mess with the rss or they tend to act in
blocks of 512. Furthermore the BZ already confirmed it can be
reproduced with THP disabled. Said that it also was supposedly already
fixed by the various patches you manually backported to your build.
I believe for fairness (mailing list traffic etc..) it's be preferable
to continue the debugging in the open BZ and not here because you
didn't reproduce it on a upstraem kernel yet so we cannot be 100% sure
if upstream (if only -stable) could reproduce it.
Thanks,
Andrea