RE: [PATCH v10 2/3] arm64: mm: implement arch_faults_on_old_pte() on arm64

From: Justin He (Arm Technology China)
Date: Mon Oct 07 2019 - 21:55:36 EST


Hi Will and Marc

> -----Original Message-----
> From: Marc Zyngier <maz@xxxxxxxxxx>
> Sent: October 1, 2019 21:32
> To: Will Deacon <will@xxxxxxxxxx>
> Cc: Justin He (Arm Technology China) <Justin.He@xxxxxxx>; Catalin
> Marinas <Catalin.Marinas@xxxxxxx>; Mark Rutland
> <Mark.Rutland@xxxxxxx>; James Morse <James.Morse@xxxxxxx>;
> Matthew Wilcox <willy@xxxxxxxxxxxxx>; Kirill A. Shutemov
> <kirill.shutemov@xxxxxxxxxxxxxxx>; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; Punit Agrawal
> <punitagrawal@xxxxxxxxx>; Thomas Gleixner <tglx@xxxxxxxxxxxxx>;
> Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; hejianet@xxxxxxxxx; Kaly
> Xin (Arm Technology China) <Kaly.Xin@xxxxxxx>
> Subject: Re: [PATCH v10 2/3] arm64: mm: implement
> arch_faults_on_old_pte() on arm64
>
> On Tue, 1 Oct 2019 13:50:32 +0100
> Will Deacon <will@xxxxxxxxxx> wrote:
>
> > On Mon, Sep 30, 2019 at 09:57:39AM +0800, Jia He wrote:
> > > On arm64 without hardware Access Flag, copying from user will fail
> because
> > > the pte is old and cannot be marked young. So we always end up with
> zeroed
> > > page after fork() + CoW for pfn mappings. we don't always have a
> > > hardware-managed access flag on arm64.
> > >
> > > Hence implement arch_faults_on_old_pte on arm64 to indicate that it
> might
> > > cause page fault when accessing old pte.
> > >
> > > Signed-off-by: Jia He <justin.he@xxxxxxx>
> > > Reviewed-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> > > ---
> > > arch/arm64/include/asm/pgtable.h | 14 ++++++++++++++
> > > 1 file changed, 14 insertions(+)
> > >
> > > diff --git a/arch/arm64/include/asm/pgtable.h
> b/arch/arm64/include/asm/pgtable.h
> > > index 7576df00eb50..e96fb82f62de 100644
> > > --- a/arch/arm64/include/asm/pgtable.h
> > > +++ b/arch/arm64/include/asm/pgtable.h
> > > @@ -885,6 +885,20 @@ static inline void update_mmu_cache(struct
> vm_area_struct *vma,
> > > #define phys_to_ttbr(addr) (addr)
> > > #endif
> > >
> > > +/*
> > > + * On arm64 without hardware Access Flag, copying from user will fail
> because
> > > + * the pte is old and cannot be marked young. So we always end up
> with zeroed
> > > + * page after fork() + CoW for pfn mappings. We don't always have a
> > > + * hardware-managed access flag on arm64.
> > > + */
> > > +static inline bool arch_faults_on_old_pte(void)
> > > +{
> > > + WARN_ON(preemptible());
> > > +
> > > + return !cpu_has_hw_af();
> > > +}
> >
> > Does this work correctly in a KVM guest? (i.e. is the MMFR sanitised in
> that
> > case, despite not being the case on the host?)
>
> Yup, all the 64bit MMFRs are trapped (HCR_EL2.TID3 is set for an
> AArch64 guest), and we return the sanitised version.
Thanks, Marc, for the explanation. I verified the patch series on a KVM guest
(-M virt) with a simulated NVDIMM device created by QEMU. The host is a
ThunderX2 (aarch64).
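
For readers following along, the decision made by arch_faults_on_old_pte()
can be sketched in plain userspace C. This is only an illustration, not the
kernel code: the helper names hw_af_from_mmfr1() and faults_on_old_pte() are
made up here, and the register value is passed in as a parameter instead of
being read from the CPU. The only architectural fact assumed is that the
HAFDBS field sits in bits [3:0] of ID_AA64MMFR1_EL1 and that a non-zero value
means the hardware can update the Access Flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* ID_AA64MMFR1_EL1.HAFDBS occupies bits [3:0]. */
#define HAFDBS_SHIFT	0
#define HAFDBS_MASK	0xfULL

/*
 * Hypothetical helper: returns true if the given (possibly sanitised)
 * ID_AA64MMFR1_EL1 value advertises hardware Access Flag updates.
 * 0 = not supported, non-zero = AF (and possibly dirty state) in hardware.
 */
static inline bool hw_af_from_mmfr1(uint64_t mmfr1)
{
	return ((mmfr1 >> HAFDBS_SHIFT) & HAFDBS_MASK) != 0;
}

/*
 * Same shape as arch_faults_on_old_pte() in the patch: without a
 * hardware-managed Access Flag, touching an old pte takes a fault.
 */
static inline bool faults_on_old_pte(uint64_t mmfr1)
{
	return !hw_af_from_mmfr1(mmfr1);
}
```

In a guest the value fed into such a check is whatever the hypervisor returns
for the trapped register read, which is why the sanitised view matters.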

>
> But that's an interesting remark: we're now trading an extra fault on
> CPUs that do not support HWAFDBS for a guaranteed trap for each and
> every guest under the sun that will hit the COW path...
>
> My gut feeling is that this is going to be pretty visible. Jia, do you
> have any numbers for this kind of behaviour?
This is not the common COW path; it only affects COW for PFN-mapped pages.
I added a counter (g_counter) before pte_mkyoung in force_mkyoung{} while
testing vmmalloc_fork at [1].

This test case starts M forked processes and N pthreads. The default is
M=2, N=4, for which g_counter is about 241, i.e. the test hits my patch
series about 241 times.
With M=20 and N=40 for TEST3, g_counter is about 1492.
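
To make clear what g_counter measures, here is a minimal userspace model of
the instrumentation. It is a sketch under assumptions, not the kernel change:
struct toy_pte and toy_cow_prepare() are hypothetical names, only the "young"
bit of a pte is modelled, and the counter is a C11 atomic instead of whatever
was used in the actual debug build.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy pte: only the "young" (accessed) bit is modelled. */
struct toy_pte {
	bool young;
};

/* Counts how often an old pte had to be forced young before the copy. */
static atomic_ulong g_counter;

/*
 * Hypothetical stand-in for the step before copying from a pfn mapping:
 * if the architecture faults on old ptes and this pte is old, mark it
 * young and bump the counter. Returns true when the pte was forced young.
 */
static inline bool toy_cow_prepare(struct toy_pte *pte, bool arch_faults_on_old)
{
	if (arch_faults_on_old && !pte->young) {
		atomic_fetch_add(&g_counter, 1);
		pte->young = true;
		return true;
	}
	return false;
}
```

A pte that is already young, or a CPU with a hardware-managed Access Flag,
never increments the counter in this model, so the count only reflects the
cases where the extra work is actually done.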

[1] https://github.com/pmem/pmdk/tree/master/src/test/vmmalloc_fork


--
Cheers,
Justin (Jia He)