Re: [PATCH 07/25] mm/csky: Use mm_fault_accounting()
From: Peter Xu
Date: Wed Jun 17 2020 - 11:49:34 EST
On Wed, Jun 17, 2020 at 03:04:49PM +0800, Guo Ren wrote:
> Hi Peter,
Hi, Guo,
>
> On Tue, Jun 16, 2020 at 6:16 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
> >
> > Use the new mm_fault_accounting() helper for page fault accounting.
> > Also, move the accounting to be after release of mmap_sem.
> >
> > CC: Guo Ren <guoren@xxxxxxxxxx>
> > CC: linux-csky@xxxxxxxxxxxxxxx
> > Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
> > ---
> > arch/csky/mm/fault.c | 13 +------------
> > 1 file changed, 1 insertion(+), 12 deletions(-)
> >
> > diff --git a/arch/csky/mm/fault.c b/arch/csky/mm/fault.c
> > index 4e6dc68f3258..8f8d34d27eca 100644
> > --- a/arch/csky/mm/fault.c
> > +++ b/arch/csky/mm/fault.c
> > @@ -111,8 +111,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long write,
> > return;
> > }
> > #endif
> > -
> > - perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
> > /*
> > * If we're in an interrupt or have no user
> > * context, we must not take the fault..
> > @@ -160,17 +158,8 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long write,
> > goto bad_area;
> > BUG();
> > }
> > - if (fault & VM_FAULT_MAJOR) {
> > - tsk->maj_flt++;
> > - perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs,
> > - address);
> > - } else {
> > - tsk->min_flt++;
> > - perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs,
> > - address);
> > - }
> > -
> > up_read(&mm->mmap_sem);
> > + mm_fault_accounting(tsk, regs, address, fault & VM_FAULT_MAJOR);
> > return;
> >
> > /*
> > --
> > 2.26.2
> >
> I notice that you move it out of mm->mmap_sem's area, all archs should
> follow the rule ? Can you give me a clarification and put it into de
> commit log ?
I don't think it's a must, but mmap_sem should not be required at least by
observing current code. E.g., do_user_addr_fault() of x86 does the accounting
without mmap_sem even before this series.
The perf events should be thread safe on its own. Frankly speaking I'm not
very certain about my understanding on the per task counters, because iiuc we
can also try to get user pages remotely of a thread in parallel with the page
fault happening on the same thread, then it seems to me that the per task pf
counters can be accessed on different cores simultaneously. However otoh that
seems to be very rare, and it's still acceptable to me as a trade off to avoid
overhead of locks or atomic ops on the counters. I'd be glad to be corrected
if I missed anything important here...
Thanks,
--
Peter Xu