Re: linux-next: BUG: Bad page state in process ip6tables-save pfn:1499f4

From: Kirill A. Shutemov
Date: Tue Jun 27 2017 - 12:38:59 EST


On Tue, Jun 27, 2017 at 09:18:15AM +0200, Vlastimil Babka wrote:
> On 06/24/2017 05:08 PM, Andrei Vagin wrote:
> > On Fri, Jun 23, 2017 at 05:17:44PM -0700, Andrei Vagin wrote:
> >> On Thu, Jun 22, 2017 at 11:21:03PM -0700, Andrei Vagin wrote:
> >>> Hello,
> >>>
> >>> We run CRIU tests for linux-next and today they triggered a kernel
> >>> bug. I want to mention that this kernel is built with kasan. This bug
> >>> was triggered in travis-ci. I can't reproduce it on my host. Without
> >>> kasan, kernel crashed but it is impossible to get a kernel log for
> >>> this case.
> >>
> >> We use this tree
> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/
> >>
> >> This issue isn't reproduced on the akpm-base branch and
> >> it is reproduced each time on the akpm branch. I didn't
> >> have time today to bisect it, will do on Monday.
> >
> > c3aab7b2d4e8434d53bc81770442c14ccf0794a8 is the first bad commit
> >
> > commit c3aab7b2d4e8434d53bc81770442c14ccf0794a8
> > Merge: 849c34f 93a7379
> > Author: Stephen Rothwell
> > Date: Fri Jun 23 16:40:07 2017 +1000
> >
> > Merge branch 'akpm-current/current'
>
> Hm is it really the merge of mmotm itself and not one of the patches in
> mmotm?
> Anyway smells like THP, adding Kirill.

Okay, it took a while to figure it out.

The bug is in the patch "mm, gup: ensure real head page is ref-counted when
using hugepages". We should look up the head page *before* the loop. The loop
advances 'page', so afterwards it may point to the first page beyond the
compound page, and compound_head() then returns the wrong head.
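
To illustrate the ordering problem outside the kernel, here is a minimal
userspace sketch. It is a toy model, not the gup code: 'struct page' here
holds only a head pointer, HPAGE_NR, init_mem(), head_after_loop() and
head_before_loop() are hypothetical names, and it assumes the loop advances
'page' once per iteration, as the real loop does in the line that falls
between the hunks of the patch.

```c
#include <stddef.h>

/* Toy stand-in for the kernel's struct page: each subpage of a compound
 * page records its head; an ordinary page is its own head. */
struct page {
	struct page *head;
};

static struct page *compound_head(struct page *page)
{
	return page->head;
}

#define HPAGE_NR 8	/* hypothetical: subpages per huge page in this model */

/* mem[0..HPAGE_NR-1] form one compound page, mem[HPAGE_NR] is unrelated. */
static void init_mem(struct page mem[HPAGE_NR + 1])
{
	for (size_t i = 0; i < HPAGE_NR; i++)
		mem[i].head = &mem[0];
	mem[HPAGE_NR].head = &mem[HPAGE_NR];
}

/* Buggy order: walk subpages [first, HPAGE_NR), then look up the head.
 * Requires first < HPAGE_NR. */
static struct page *head_after_loop(struct page *mem, size_t first)
{
	struct page *page = &mem[first];
	size_t i = first;

	do {
		page++;
	} while (++i != HPAGE_NR);

	/* page is now &mem[HPAGE_NR], one past the compound page */
	return compound_head(page);
}

/* Fixed order: look up the head while 'page' is still inside the
 * compound page, then walk. Requires first < HPAGE_NR. */
static struct page *head_before_loop(struct page *mem, size_t first)
{
	struct page *page = &mem[first];
	struct page *head = compound_head(page);
	size_t i = first;

	do {
		page++;
	} while (++i != HPAGE_NR);

	return head;
}
```

In this model, after init_mem(mem), head_before_loop(mem, 3) returns &mem[0],
while head_after_loop(mem, 3) returns &mem[HPAGE_NR], the unrelated page. The
wrong result only shows up when the walked range ends exactly at the end of
the huge page, which would fit the bug not triggering on every run.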

The patch below should help.

If there are no objections, Andrew, could you fold it into the problematic patch?

diff --git a/mm/gup.c b/mm/gup.c
index d8db6e5016a8..6f9ca86b3d03 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1424,6 +1424,7 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,

 	refs = 0;
 	page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
+	head = compound_head(page);
 	do {
 		pages[*nr] = page;
 		(*nr)++;
@@ -1431,7 +1432,6 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 		refs++;
 	} while (addr += PAGE_SIZE, addr != end);

-	head = compound_head(page);
 	if (!page_cache_add_speculative(head, refs)) {
 		*nr -= refs;
 		return 0;
@@ -1462,6 +1462,7 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,

 	refs = 0;
 	page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
+	head = compound_head(page);
 	do {
 		pages[*nr] = page;
 		(*nr)++;
@@ -1469,7 +1470,6 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 		refs++;
 	} while (addr += PAGE_SIZE, addr != end);

-	head = compound_head(page);
 	if (!page_cache_add_speculative(head, refs)) {
 		*nr -= refs;
 		return 0;
@@ -1499,6 +1499,7 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
 	BUILD_BUG_ON(pgd_devmap(orig));
 	refs = 0;
 	page = pgd_page(orig) + ((addr & ~PGDIR_MASK) >> PAGE_SHIFT);
+	head = compound_head(page);
 	do {
 		pages[*nr] = page;
 		(*nr)++;
@@ -1506,7 +1507,6 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
 		refs++;
 	} while (addr += PAGE_SIZE, addr != end);

-	head = compound_head(page);
 	if (!page_cache_add_speculative(head, refs)) {
 		*nr -= refs;
 		return 0;
--
Kirill A. Shutemov