Re: a racy access flag clearing warning when calling mmap system call

From: Will Deacon
Date: Thu Dec 07 2017 - 08:23:03 EST


On Thu, Dec 07, 2017 at 09:46:59AM +0800, Yisheng Xie wrote:
> On 2017/12/1 21:18, Will Deacon wrote:
> > On Fri, Dec 01, 2017 at 03:38:04PM +0800, chenjiankang wrote:
> >> ------------[ cut here ]------------
> >> WARNING: at ../../../../../kernel/linux-4.1/arch/arm64/include/asm/pgtable.h:211
> >
> > Given that this is a fairly old 4.1 kernel, could you try to reproduce the
> > failure with something more recent, please? We've fixed many bugs since
> > then, some of them involving huge pages.
>
> Yeah, this is an old kernel, but I found a scenario that triggers this WARN_ON:
> During fork, dup_mmap() ends up calling copy_huge_pmd(), which clears the Access Flag.
> dup_mmap
> -> copy_page_range
> -> copy_pud_range
> -> copy_pmd_range
> -> copy_huge_pmd
> -> pmd_mkold
>
> If we do not have any access after dup_mmap, and start to split this thp,
> it will cause this call trace in the old kernel, right?
>
> This seems to be a normal scenario, yet it triggers the call trace on this old kernel.
> Therefore, for this old kernel, we should just remove this WARN_ON_ONCE, right?

Whilst racy clearing of the access flag should be safe in practice, I like
having the warning around because it does indicate that we're setting
something to old which could immediately be made young again by the CPU.

In this case, it looks like the mm isn't even live, so a better approach
would probably be to predicate that conditional on mm == current->active_mm
or something like that. That also avoids us getting false positives for
the dirty bit case, which would be harmful if the table was installed.
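
As a rough illustration of that predicate, here is a minimal userspace
sketch. The types mm_model and pte_model, the DEBUG_VM_MODEL macro and the
function would_warn() are hypothetical stand-ins invented for this example,
not kernel definitions; the sketch only models the condition, not real page
table handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_DEBUG_VM being enabled. */
#define DEBUG_VM_MODEL 1

/* Hypothetical model of an mm: only the user count matters here. */
struct mm_model {
	int mm_users;		/* models atomic_read(&mm->mm_users) */
};

/* Hypothetical model of a pte: valid bit and access flag only. */
struct pte_model {
	bool valid;
	bool young;		/* access flag set */
};

/*
 * Returns true when the "racy access flag clearing" warning would fire:
 * both the old and new entries are valid, the mm looks live (it is the
 * caller's active mm, or it has more than one user), and the new entry
 * is old.
 */
static bool would_warn(const struct mm_model *mm,
		       const struct mm_model *active_mm,
		       struct pte_model old_pte, struct pte_model new_pte)
{
	assert(mm != NULL && active_mm != NULL);

	bool mm_live = (mm == active_mm) || (mm->mm_users > 1);

	if (!DEBUG_VM_MODEL || !old_pte.valid || !new_pte.valid || !mm_live)
		return false;

	return !new_pte.young;
}
```

With this model, clearing the access flag in an mm that is neither the
caller's active mm nor shared by multiple users returns false, while the same
update against the caller's own active mm returns true.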

Diff below. It's still racy with concurrent fork, but I don't want this
check to become a generic "does my caller hold all the locks to protect
against a concurrent walk" predicate, so it just means we won't catch all
possible races.

Will

--->8

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 149d05fb9421..8fe103b1e101 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -42,6 +42,8 @@
 #include <asm/cmpxchg.h>
 #include <asm/fixmap.h>
 #include <linux/mmdebug.h>
+#include <linux/mm_types.h>
+#include <linux/sched.h>

 extern void __pte_error(const char *file, int line, unsigned long val);
 extern void __pmd_error(const char *file, int line, unsigned long val);
@@ -207,9 +209,6 @@ static inline void set_pte(pte_t *ptep, pte_t pte)
 	}
 }

-struct mm_struct;
-struct vm_area_struct;
-
 extern void __sync_icache_dcache(pte_t pteval, unsigned long addr);

 /*
@@ -238,7 +237,8 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
 	 * hardware updates of the pte (ptep_set_access_flags safely changes
 	 * valid ptes without going through an invalid entry).
 	 */
-	if (pte_valid(*ptep) && pte_valid(pte)) {
+	if (IS_ENABLED(CONFIG_DEBUG_VM) && pte_valid(*ptep) && pte_valid(pte) &&
+	   (mm == current->active_mm || atomic_read(&mm->mm_users) > 1)) {
 		VM_WARN_ONCE(!pte_young(pte),
 			     "%s: racy access flag clearing: 0x%016llx -> 0x%016llx",
 			     __func__, pte_val(*ptep), pte_val(pte));