Re: [LKP] Re: [mm/memcg] bd0b230fe1: will-it-scale.per_process_ops -22.7% regression
From: Michal Hocko
Date: Thu Nov 12 2020 - 09:17:01 EST
On Thu 12-11-20 20:28:44, Feng Tang wrote:
> Hi Michal,
>
> On Wed, Nov 04, 2020 at 09:15:46AM +0100, Michal Hocko wrote:
> > > > > Hi Michal,
> > > > >
> > > > > We used the default configure of cgroups, not sure what configuration you
> > > > > want,
> > > > > could you give me more details? and here is the cgroup info of will-it-scale
> > > > > process:
> > > > >
> > > > > $ cat /proc/3042/cgroup
> > > > > 12:hugetlb:/
> > > > > 11:memory:/system.slice/lkp-bootstrap.service
> > > >
> > > > OK, this means that the memory controller is enabled and in use. Btw. do you
> > > > get the original performance if you add one phony page_counter after the
> > > > union?
> > > >
> > > I added one phony page_counter after the union and re-tested; the regression
> > > was reduced to -1.2%. It looks like the regression is caused by the data
> > > structure layout change.
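(For reference, I read that experiment as something along these lines -- only a
sketch on my side, the _pad name is made up and this is not necessarily the
exact diff that was used:

	struct mem_cgroup {
		struct cgroup_subsys_state css;
		struct mem_cgroup_id id;

		/* Accounted resources */
		struct page_counter memory;		/* Both v1 & v2 */
		union {
			struct page_counter swap;	/* v2 only */
			struct page_counter memsw;	/* v1 only */
		};
		struct page_counter _pad;		/* phony, never touched */

		/* Legacy consumer-oriented counters */
		struct page_counter kmem;		/* v1 only */
		struct page_counter tcpmem;		/* v1 only */
		/* ... rest of the struct unchanged ... */
	};

i.e. the unused counter only pushes kmem, tcpmem and everything below them back
to the offsets they had before swap and memsw were unified into the union.)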
> >
> > Thanks for double checking. Could you try to cache-align the
> > page_counter struct? If that helps then we should figure out which counters
> > clash against each other by adding the alignment between the respective
> > counters.
>
> We tried the patch below to make the 'page_counter' aligned.
>
> diff --git a/include/linux/page_counter.h b/include/linux/page_counter.h
> index bab7e57..9efa6f7 100644
> --- a/include/linux/page_counter.h
> +++ b/include/linux/page_counter.h
> @@ -26,7 +26,7 @@ struct page_counter {
> /* legacy */
> unsigned long watermark;
> unsigned long failcnt;
> -};
> +} ____cacheline_internodealigned_in_smp;
>
> and with it, the -22.7% performance change turns into a small -1.7%, which
> confirms the regression is caused by the change in data alignment.
>
> After the patch, the size of 'page_counter' increases from 104 bytes to 128
> bytes, and the size of 'mem_cgroup' increases from 2880 bytes to 3008
> bytes (with our kernel config). Another major data structure which
> contains 'page_counter' is 'hugetlb_cgroup', whose size will change
> from 912 bytes to 1024 bytes.
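(Btw, assuming you have a vmlinux with debug info at hand, pahole is a
convenient way to double check those sizes and the resulting member offsets:

	$ pahole -C page_counter vmlinux
	$ pahole -C mem_cgroup vmlinux

It also marks the cacheline boundaries in its output, which is exactly what
matters here.)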
>
> Should we make these page_counters cache-aligned to reduce cacheline conflicts?
I would rather focus on a more effective mem_cgroup layout. It is very
likely that we are just stumbling over two counters here.
Could you try adding cache alignment to the counters after memory and see
which one makes the difference? I do not expect memsw to be the one,
because that one is used together with the main counter. But who knows,
maybe the way it crosses the cache line has exactly that effect. Hard to
tell without other numbers.
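To be concrete, I mean something along these lines -- a sketch only, against
the current field order, and trying one counter at a time rather than all of
them at once:

	/* Accounted resources */
	struct page_counter memory;		/* Both v1 & v2 */

	union {
		struct page_counter swap;	/* v2 only */
		struct page_counter memsw;	/* v1 only */
	} ____cacheline_aligned_in_smp;

	/* Legacy consumer-oriented counters */
	struct page_counter kmem ____cacheline_aligned_in_smp;		/* v1 only */
	struct page_counter tcpmem ____cacheline_aligned_in_smp;	/* v1 only */

That way we would only pay the padding for the counter which actually
conflicts, rather than for every page_counter user like hugetlb_cgroup.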
Btw. it would be great to see what the effect is on cgroup v2 as well.
Thanks for pursuing this!
--
Michal Hocko
SUSE Labs