Re: mm, something wrong in page_lock_anon_vma_read()?
From: Andrea Arcangeli
Date: Thu Jul 20 2017 - 08:58:46 EST
On Wed, Jul 19, 2017 at 05:59:01PM +0800, Xishi Qiu wrote:
> I find two patches from upstream.
> 887843961c4b4681ee993c36d4997bf4b4aa8253
Do you use the remap_file_pages syscall? That syscall has been dropped
upstream, so very few apps are likely to still use it on 64bit
archs.
It would also require a get_user_pages(write=1, force=1) on a nonlinear
VM_SHARED vma mapped without PROT_WRITE, and that would have to happen
before remap_file_pages is called to overwrite the page that got poked
by gdb.
This sounds like an extremely unusual setup for a production
environment. That said, you're clearly running docker containers, so
who knows what is running inside them (and the container that crashes
at the point where you notice the stale anon-vma isn't necessarily the
same container that runs the fremap readonly gdb poking workload).
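To make the mapping side of such a workload concrete, here is a minimal
userspace sketch, not a reproducer: it only creates a MAP_SHARED mapping
without PROT_WRITE and rearranges it with remap_file_pages. It does not do
the gdb poke (the forced get_user_pages write), and the helper name
fremap_readonly_shared() and the temporary file path are made up for
illustration. It assumes Linux with glibc; on kernels where the syscall is
emulated the call should still succeed, so this shows the call shape only.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes remap_file_pages() in <sys/mman.h> */
#endif
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only).
 * Returns  1 if remap_file_pages() rearranged the read-only shared mapping,
 *          0 if the syscall is unavailable here (ENOSYS, or EPERM when a
 *            seccomp filter blocks it),
 *         -1 on any other failure.
 */
int fremap_readonly_shared(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    char path[] = "/tmp/fremap-XXXXXX";
    char *buf = NULL;
    char *map = MAP_FAILED;
    int ret = -1;
    int fd;

    if (psz <= 0)
        return -1;

    /* mkstemp() opens the file O_RDWR, so the mapping stays VM_SHARED. */
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    /* Two file pages with distinct contents: 'A' then 'B'. */
    buf = malloc(2 * (size_t)psz);
    if (!buf)
        goto out;
    memset(buf, 'A', (size_t)psz);
    memset(buf + psz, 'B', (size_t)psz);
    if (write(fd, buf, 2 * (size_t)psz) != (ssize_t)(2 * psz))
        goto out;

    /* Shared file mapping without PROT_WRITE. */
    map = mmap(NULL, 2 * (size_t)psz, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    /* Make the first virtual page point at file page 1 (prot must be 0). */
    if (remap_file_pages(map, (size_t)psz, 0, 1, 0) == 0) {
        if (map[0] == 'B' && map[psz] == 'B')
            ret = 1;
    } else if (errno == ENOSYS || errno == EPERM) {
        ret = 0;
    }

out:
    if (map != MAP_FAILED)
        munmap(map, 2 * (size_t)psz);
    free(buf);
    close(fd);
    return ret;
}
```

Calling fremap_readonly_shared() from a small main() and checking for a
non-negative return is enough to see whether the syscall is reachable in a
given container.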
I'll look into integrating the above fix regardless.
I'll also send you privately the fix backported to the specific
enterprise kernel you're using, adding a WARN_ON as well that will
tell us if such a fix ever makes a difference. The alternative is that
you place a perf probe or systemtap hook in remap_file_pages to know
if it ever runs, but the WARN_ON I'll add is even better proof. If you
get the WARN_ON in the logs, we'll be 100% sure the patch fixed
your issue and we don't have to keep looking for other issues of the
same kind.
> a9c8e4beeeb64c22b84c803747487857fe424b68
>
> I can't find any relations to the panic from the first one, and the second
Actually I do. Vlastimil's theory that a pte got marked none is sound,
but if zap_pte in a fremap fails to drop the anon page that was under
memory migration/compaction the exact same thing will happen. Either
way an anon page isn't freed as it should have been: the vma will be
dropped, the anon-vma too, but the page will be left hanging around as
anonymous in the lrus with page->mapping pointing to a stale anon_vma
and the rss counters will go off by one too.
> one seems triggered from xen, but we use kvm.
Correct, the second one isn't needed with KVM.
Thanks,
Andrea