Re: [PATCH v10 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

From: Will Deacon
Date: Tue Oct 08 2019 - 08:39:52 EST


On Tue, Oct 08, 2019 at 02:19:05AM +0000, Justin He (Arm Technology China) wrote:
> > -----Original Message-----
> > From: Will Deacon <will@xxxxxxxxxx>
> > Sent: 2019å10æ1æ 20:54
> > To: Justin He (Arm Technology China) <Justin.He@xxxxxxx>
> > Cc: Catalin Marinas <Catalin.Marinas@xxxxxxx>; Mark Rutland
> > <Mark.Rutland@xxxxxxx>; James Morse <James.Morse@xxxxxxx>; Marc
> > Zyngier <maz@xxxxxxxxxx>; Matthew Wilcox <willy@xxxxxxxxxxxxx>; Kirill A.
> > Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>; linux-arm-
> > kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-
> > mm@xxxxxxxxx; Punit Agrawal <punitagrawal@xxxxxxxxx>; Thomas
> > Gleixner <tglx@xxxxxxxxxxxxx>; Andrew Morton <akpm@linux-
> > foundation.org>; hejianet@xxxxxxxxx; Kaly Xin (Arm Technology China)
> > <Kaly.Xin@xxxxxxx>
> > Subject: Re: [PATCH v10 3/3] mm: fix double page fault on arm64 if PTE_AF
> > is cleared
> >
> > On Mon, Sep 30, 2019 at 09:57:40AM +0800, Jia He wrote:
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index b1ca51a079f2..1f56b0118ef5 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -118,6 +118,13 @@ int randomize_va_space __read_mostly =
> > > 2;
> > > #endif
> > >
> > > +#ifndef arch_faults_on_old_pte
> > > +static inline bool arch_faults_on_old_pte(void)
> > > +{
> > > + return false;
> > > +}
> > > +#endif
> >
> > Kirill has acked this, so I'm happy to take the patch as-is, however isn't
> > it the case that /most/ architectures will want to return true for
> > arch_faults_on_old_pte()? In which case, wouldn't it make more sense for
> > that to be the default, and have x86 and arm64 provide an override? For
> > example, aren't most architectures still going to hit the double fault
> > scenario even with your patch applied?
>
> No, after applying my patch series, only those architectures which don't provide
> setting access flag by hardware AND don't implement their arch_faults_on_old_pte
> will hit the double page fault.
>
> The meaning of true for arch_faults_on_old_pte() is "this arch doesn't have the hardware
> setting access flag way, it might cause page fault on an old pte"
> I don't want to change other architectures' default behavior here. So by default,
> arch_faults_on_old_pte() is false.

...and my complaint is that this is the majority of supported architectures,
so you're fixing something for arm64 which also affects arm, powerpc,
alpha, mips, riscv, ...

Chances are, they won't even realise they need to implement
arch_faults_on_old_pte() until somebody runs into the double fault and
wastes lots of time debugging it before they spot your patch.

> Btw, currently I only observed this double pagefault on arm64's guest
> (host is ThunderX2). On X86 guest (host is Intel(R) Core(TM) i7-4790 CPU
> @ 3.60GHz ), there is no such double pagefault. It has the similar setting
> access flag way by hardware.

Right, and that's why I'm not concerned about x86 for this problem.

Will