Re: sparc64 WARNING: at mm/mmap.c:2757 exit_mmap+0x13c/0x160()

From: David Miller
Date: Tue Jul 29 2014 - 19:26:44 EST

From: mroos@xxxxxxxx
Date: Thu, 17 Apr 2014 01:22:17 +0300 (EEST)

>> > Just for the archives, I got one of these again with 3.14:
>> Meelis and Aaro, thanks again for all of your reports.
>> After poring over a lot of the data and auditing some code I'm
>> suspecting it's a problem with transparent huge pages.
>> One thing you two can do to help me further confirm this is to run
>> with THP disabled for a while and see if you still get the log
>> messages.
> I have since turned off CONFIG_TRANSPARENT_HUGEPAGE on 3 of 4 servers
> that had this problem (actually most of my sparc64 machines) and the 4th
> has
> # CONFIG_HUGETLBFS is not set
> and also has not had this problem since then. All 4 machines have been
> running through most -rc's of every kernel.

Here is something I'd like you guys to test.

Yesterday, Christopher (CC:'d) posted some fixes, and one of
them is very interesting.

Basically the update_mmu_cache() methods on sparc64 can insert an
invalid PTE into the TSB hash tables, causing livelocks and other
annoying issues.

The path where this can happen is via remove_migration_pte().
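
To make the failure mode concrete, here is a rough user-space sketch of
the guard. It is not the kernel code: toy_tsb, toy_tsb_insert() and the
TOY_* constants are made-up stand-ins, and the table layout is only
assumed for illustration. The point is just that an entry without the
valid bit must never reach the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the kernel's definitions. */
#define TOY_PAGE_VALID   (1ULL << 63)	/* assumed "valid" bit */
#define TOY_PAGE_SHIFT   13		/* assumed 8K base pages */
#define TOY_TSB_ENTRIES  8

struct toy_tsb_entry {
	uint64_t tag;	/* virtual page number */
	uint64_t pte;	/* translation data */
};

static struct toy_tsb_entry toy_tsb[TOY_TSB_ENTRIES];

/* Insert a translation; refuse anything without the valid bit set. */
bool toy_tsb_insert(uint64_t vaddr, uint64_t pte)
{
	size_t idx;

	/* Same idea as the check in the patch: bail out early. */
	if (!(pte & TOY_PAGE_VALID))
		return false;

	idx = (vaddr >> TOY_PAGE_SHIFT) % TOY_TSB_ENTRIES;
	toy_tsb[idx].tag = vaddr >> TOY_PAGE_SHIFT;
	toy_tsb[idx].pte = pte;
	return true;
}

/* Small usage example: a valid entry goes in, a non-valid one does not. */
void toy_selfcheck(void)
{
	assert(toy_tsb_insert(0x6000, TOY_PAGE_VALID | 0x40));
	assert(!toy_tsb_insert(0x8000, 0x40));
}
```
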

I had a discussion with Johannes Weiner about this and we determined
that it would be understandable to mis-diagnose THP as the root cause
of the RSS counter and related problems if this bug is the real
reason those things are happening.

That's because if you're not using THP there is less compaction going
on. Less compaction means less migration, and therefore a lower
likelihood of this code path triggering.
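
If you want a rough idea of how much migration a box is doing while you
test, something like the sketch below may help. It assumes the kernel
exposes counters such as pgmigrate_success and compact_stall in
/proc/vmstat (that depends on kernel version and config); the helper
names vmstat_lookup() and vmstat_read() are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Find one "name value" line in vmstat-formatted text.
 * Returns 0 and stores the value on success, -1 if not found.
 */
int vmstat_lookup(const char *text, const char *name,
		  unsigned long long *out)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Require an exact name match followed by a space. */
		if (strncmp(line, name, len) == 0 && line[len] == ' ')
			return sscanf(line + len, "%llu", out) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Read a single counter from /proc/vmstat (assumed to fit in buf). */
int vmstat_read(const char *name, unsigned long long *out)
{
	char buf[16384];
	size_t n;
	FILE *f = fopen("/proc/vmstat", "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return vmstat_lookup(buf, name, out);
}

/* Usage example on canned input. */
void vmstat_selfcheck(void)
{
	unsigned long long v = 0;

	assert(vmstat_lookup("compact_stall 3\npgmigrate_success 42\n",
			     "compact_stall", &v) == 0);
	assert(v == 3);
}
```

Sampling vmstat_read("pgmigrate_success", &v) before and after a test
run, with and without THP, should show whether migration activity
really drops when THP is off.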

Could you guys please try this patch below? Thanks.

diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 16b58ff..8e894e0 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -351,6 +351,10 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *

 	mm = vma->vm_mm;

+	/* Don't insert a non-valid PTE into the TSB, we'll deadlock. */
+	if (!pte_accessible(mm, pte))
+		return;

 	spin_lock_irqsave(&mm->context.lock, flags);

@@ -2617,6 +2621,10 @@ void update_mmu_cache_pmd(struct vm_area_struct *vma, unsigned long addr,
 	if (!pmd_large(entry) || !pmd_young(entry))
 		return;

 	pte = pmd_val(entry);

+	/* Don't insert a non-valid PMD into the TSB, we'll deadlock. */
+	if (!(pte & _PAGE_VALID))
+		return;

 	/* We are fabricating 8MB pages using 4MB real hw pages. */
