Re: [PATCH 1/1] mm:improve the performance during fork

From: jun qian
Date: Tue Dec 22 2020 - 10:33:57 EST


Souptick Joarder <jrdr.linux@xxxxxxxxx> wrote on Tue, Dec 22, 2020 at 11:08 PM:
>
> On Tue, Dec 22, 2020 at 5:49 PM <qianjun.kernel@xxxxxxxxx> wrote:
> >
> > From: jun qian <qianjun.kernel@xxxxxxxxx>
> >
> > In our project, many business delays come from fork, so
> > we started looking into why fork is time-consuming.
> > I used ftrace with the function_graph tracer to trace fork and found
> > that vm_normal_page is called tens of thousands of times, while
> > each call takes only a few nanoseconds. vm_normal_page is not an
> > inline function, so I think that making it inline may reduce the
> > function call overhead.
> >
> > I did the following experiment:
> >
> > I wrote the following C test code (please ignore the memory leak :)).
> > Before the fork, it mallocs 4G bytes, then measures the fork
> > time.
> >
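> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <sys/time.h>
> > #include <unistd.h>
> >
> > /* Assumed value: LEN was not shown; 4G bytes in 4096-byte chunks. */
> > #define LEN (4ULL * 1024 * 1024 * 1024 / 4096)
> >
> > int main(void)
> > {
> > 	char *p;
> > 	unsigned long long i = 0;
> > 	float time_use = 0;
> > 	struct timeval start;
> > 	struct timeval end;
> >
> > 	/* Touch one byte per allocation so every page is really mapped. */
> > 	for (i = 0; i < LEN; i++) {
> > 		p = (char *)malloc(4096);
> > 		if (p == NULL) {
> > 			printf("malloc failed!\n");
> > 			return 1;
> > 		}
> > 		p[0] = 0x55;
> > 	}
> >
> > 	/* Both parent and child return from fork() and print a time. */
> > 	gettimeofday(&start, NULL);
> > 	fork();
> > 	gettimeofday(&end, NULL);
> >
> > 	time_use = (end.tv_sec * 1000000 + end.tv_usec) -
> > 		   (start.tv_sec * 1000000 + start.tv_usec);
> > 	printf("time_use is %.10f us\n", time_use);
> >
> > 	return 0;
> > }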
> >
> > We need to compare the size of vmlinux and the fork time in the
> > inline and non-inline cases. vm_normal_page is called from many
> > functions, so we also need to compare the sizes of its callers.
> > For example, do_wp_page calls vm_normal_page, so I also measured
> > its size.
> >
> >                 inline          non-inline      diff
> > vmlinux size    9709248 bytes   9709824 bytes   -576 bytes
> > fork time       23475 us        24638 us        -4.7%
>
> Do you have the time diff for both the parent and the child process?

Yes, the time diffs for the child and the parent are almost the same.
For example, where a.out is the test program:

./a.out
time_use is 23342.0000000000 us
time_use is 23404.0000000000 us

>
> > do_wp_page size 972             743             +229
> >
> > According to the above test data, I think inlining vm_normal_page can
> > reduce the fork execution time.
> >
> > Signed-off-by: jun qian <qianjun.kernel@xxxxxxxxx>
> > ---
> > mm/memory.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 7d608765932b..a689bb5d3842 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -591,7 +591,7 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
> > * PFNMAP mappings in order to support COWable mappings.
> > *
> > */
> > -struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> > +inline struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> > pte_t pte)
> > {
> > unsigned long pfn = pte_pfn(pte);
> > --
> > 2.18.2
> >
> >