Re: [PATCH v12 00/31] Speculative page faults

From: Joel Fernandes
Date: Mon Dec 14 2020 - 13:11:27 EST


On Mon, Dec 14, 2020 at 10:36:29AM +0100, Laurent Dufour wrote:
> Le 14/12/2020 à 03:03, Joel Fernandes a écrit :
> > On Tue, Jul 07, 2020 at 01:31:37PM +0800, Chinwen Chang wrote:
> > [..]
> > > > > Hi Laurent,
> > > > >
> > > > > We merged SPF v11 and some patches from v12 into our platforms. After
> > > > > several experiments, we observed that SPF clearly improves the
> > > > > launch time of applications, especially the high-TLP ones:
> > > > >
> > > > > # launch time of applications (s):
> > > > >
> > > > > package         version    w/ SPF   w/o SPF   improve(%)
> > > > > ------------------------------------------------------------------
> > > > > Baidu maps      10.13.3    0.887    0.98      9.49
> > > > > Taobao          8.4.0.35   1.227    1.293     5.10
> > > > > Meituan         9.12.401   1.107    1.543     28.26
> > > > > WeChat          7.0.3      2.353    2.68      12.20
> > > > > Honor of Kings  1.43.1.6   6.63     6.713     1.24
> > > >
> > > > That's great news, thanks for reporting this!
> > > >
> > > > >
> > > > > By the way, we have verified our platforms with those patches and
> > > > > achieved the goal of mass production.
> > > >
> > > > More good news!
> > > > For my information, what is your targeted hardware?
> > > >
> > > > Cheers,
> > > > Laurent.
> > >
> > > Hi Laurent,
> > >
> > > Our targeted hardware belongs to ARM64 multi-core series.
> >
> > Hello!
> >
> > I was trying to develop an intuition about why SPF gives an improvement for
> > you on small CPU systems. This is just a high-level theory, but:
> >
> > 1. Assume the improvement is because of elimination of "blocking" on
> > mmap_sem.
> > Could it be that the mmap_sem is acquired in write-mode unnecessarily in some
> > places, thus causing blocking on mmap_sem in other paths? If so, is it
> > feasible to convert such usages to acquiring them in read-mode?
>
> That's correct, and the goal of this series is to try not holding the
> mmap_sem in read mode during page fault processing.
>
> Converting mmap_sem holders from write to read mode is not so easy, and that
> work has already been done in some places. If you think there are areas where
> this could be done, you're welcome to send patches fixing that.
>
> > 2. Assume the improvement is because of reduced read-side contention on
> > mmap_sem.
> > On small CPU systems, I would not expect reducing cache-line bouncing to give
> > such a dramatic improvement in performance as you are seeing.
>
> I don't think cache line bouncing reduction is the main source of the
> performance improvement; I would rather think it is the smaller part here.
> I guess this is mainly because a lot of page faults occur during loading
> time, and thus SPF reduces the contention on the mmap_sem.

Thanks for the reply. I think I also wrongly assumed that acquiring the mmap
rwsem in write mode in a syscall makes SPF moot. Peter explained to me on IRC
that there's still a perf improvement in write mode if an unrelated VMA is
modified while another VMA is faulting. CMIIW - not an mm expert by any
stretch.
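
To check my own understanding of that last point, here is a toy model in
Python. It is not kernel code and not taken from the series: the VMA class,
its seq field, and the helper names are all hypothetical. It only assumes
that each VMA carries a sequence count that a writer bumps around a
modification, and that a speculative fault revalidates against it.

```python
class VMA:
    """Toy stand-in for a VMA with a per-VMA sequence count (hypothetical)."""

    def __init__(self, start, end):
        self.start = start
        self.end = end
        self.seq = 0  # even: stable, odd: a writer is modifying this VMA

    def write_begin(self):
        self.seq += 1  # becomes odd

    def write_end(self):
        self.seq += 1  # becomes even again


def modify(vma):
    """Simulate a writer (e.g. a syscall) changing one VMA."""
    vma.write_begin()
    vma.end += 0x1000
    vma.write_end()


def speculative_fault(vma, addr, concurrent_writer=None):
    """Handle a fault without any mm-wide lock.

    Returns True if the speculative path succeeded, False if the caller
    would have to fall back to the regular, locked fault path.
    """
    snapshot = vma.seq
    if snapshot % 2 == 1:
        return False  # writer in progress on this VMA
    if not (vma.start <= addr < vma.end):
        return False  # address not covered by this VMA
    # Simulate a writer running while the fault is being processed.
    if concurrent_writer is not None:
        concurrent_writer()
    # Revalidate: only a change to *this* VMA invalidates the fault.
    return vma.seq == snapshot


heap = VMA(0x1000, 0x5000)
stack = VMA(0x9000, 0xA000)

# Writer touches an unrelated VMA: the fault on heap still succeeds.
unrelated_ok = speculative_fault(heap, 0x2000, lambda: modify(stack))
# Writer touches the faulting VMA itself: the fault must fall back.
same_ok = speculative_fault(heap, 0x2000, lambda: modify(heap))
```

In this sketch the fault on heap only fails when heap itself is modified,
which is how I read Peter's explanation; with a single mm-wide rwsem both
cases would have blocked.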

Thanks!

- Joel