Re: [LKP] Re: [mm/memcg] bd0b230fe1: will-it-scale.per_process_ops -22.7% regression

From: Waiman Long
Date: Tue Nov 03 2020 - 21:46:38 EST

Next message: Chao Yu: "Re: [f2fs-dev] [PATCH RFC] f2fs: fix compat F2FS_IOC_{MOVE, GARBAGE_COLLECT}_RANGE"
Previous message: kernel test robot: "[x86/io_apic] a27dca645d: Kernel panic - not syncing: timer doesn't work through Interrupt-remapped IO-APIC"
In reply to: Xing Zhengjun: "Re: [LKP] Re: [mm/memcg] bd0b230fe1: will-it-scale.per_process_ops -22.7% regression"
Next in thread: Michal Hocko: "Re: [LKP] Re: [mm/memcg] bd0b230fe1: will-it-scale.per_process_ops -22.7% regression"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 11/3/20 8:20 PM, Xing Zhengjun wrote:

On 11/2/2020 6:02 PM, Michal Hocko wrote:

On Mon 02-11-20 17:53:14, Rong Chen wrote:

On 11/2/20 5:27 PM, Michal Hocko wrote:

On Mon 02-11-20 17:15:43, kernel test robot wrote:

Greetings,

FYI, we noticed a -22.7% regression of will-it-scale.per_process_ops due to commit:

commit: bd0b230fe14554bfffbae54e19038716f96f5a41 ("mm/memcg: unify swap and memsw page counters")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

I really fail to see how this can be anything other than a data structure
layout change. There is one counter fewer.

btw. are cgroups configured at all? What would be the configuration?

Hi Michal,

We used the default cgroup configuration; I am not sure which configuration
you mean. Could you give me more details? Here is the cgroup info of the
will-it-scale process:

$ cat /proc/3042/cgroup
12:hugetlb:/
11:memory:/system.slice/lkp-bootstrap.service

OK, this means that the memory controller is enabled and in use. Btw, do you
get the original performance if you add one phony page_counter after the
union?

I added one phony page_counter after the union and re-tested; the regression shrank to -1.2%. It looks like the regression is caused by the data structure layout change.
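
For readers following along, the padding experiment can be sketched in user space as below. This is only a mock under stated assumptions: mock_page_counter models nothing but the 112-byte size quoted in this thread, hot_field is a hypothetical member standing in for whatever follows the counters, and neither struct is the real mem_cgroup layout.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct page_counter: only the 112-byte size quoted in
 * this thread is modelled, not the real members. */
struct mock_page_counter {
    unsigned char bytes[112];
};

/* Mock layout after the commit: swap and memsw share one slot. */
struct mock_memcg_unified {
    struct mock_page_counter memory;
    union {
        struct mock_page_counter swap;
        struct mock_page_counter memsw;
    };
    unsigned long hot_field;  /* hypothetical member after the counters */
};

/* Same mock layout with one phony counter added after the union. */
struct mock_memcg_padded {
    struct mock_page_counter memory;
    union {
        struct mock_page_counter swap;
        struct mock_page_counter memsw;
    };
    struct mock_page_counter phony;  /* padding only, never accessed */
    unsigned long hot_field;
};

/* Offset of the hypothetical member without the phony counter. */
static size_t hot_offset_unified(void)
{
    return offsetof(struct mock_memcg_unified, hot_field);
}

/* Offset of the hypothetical member with the phony counter. */
static size_t hot_offset_padded(void)
{
    return offsetof(struct mock_memcg_padded, hot_field);
}

/* How far the phony counter pushes every later member. */
static size_t layout_shift(void)
{
    assert(sizeof(struct mock_page_counter) == 112);
    return hot_offset_padded() - hot_offset_unified();
}
```

In this mock, layout_shift() is exactly one counter size, so the phony counter puts every later member back where it sat before the two counters were unified.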

So it looks like the regression is caused by false cacheline sharing of two or more hot items in mem_cgroup. As the size of page_counter is 112 bytes, eliminating one counter moves every later field down by 112 bytes; assuming 64-byte cachelines, 112 = 2 * 64 - 16, so the cacheline boundaries shift by 16 bytes relative to those fields. We probably need to use perf to find out what those hot items are for this particular benchmark.
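
The arithmetic above can be checked with a small helper. This is a sketch that assumes 64-byte cachelines and a structure starting on a cacheline boundary; the offsets used in the note below are made up for illustration and are not taken from the real mem_cgroup.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE_SIZE    64u   /* assumption: 64-byte cachelines */
#define PAGE_COUNTER_SIZE 112u  /* size quoted in this thread */

/* Index of the cacheline that holds a given byte offset. */
static size_t cacheline_index(size_t offset)
{
    return offset / CACHELINE_SIZE;
}

/* Position of a byte offset inside its cacheline. */
static size_t cacheline_offset(size_t offset)
{
    return offset % CACHELINE_SIZE;
}

/* New offset of a field once one counter in front of it is removed. */
static size_t offset_after_removal(size_t old_offset)
{
    assert(old_offset >= PAGE_COUNTER_SIZE);
    return old_offset - PAGE_COUNTER_SIZE;
}

/* Nonzero when two byte offsets land in the same cacheline. */
static int same_cacheline(size_t a, size_t b)
{
    return cacheline_index(a) == cacheline_index(b);
}
```

As an illustration, two hypothetical fields at offsets 376 and 392 sit in different cachelines (5 and 6). After one counter is removed they move to 264 and 280, which both fall in cacheline 4, so writes to one can now bounce the line that holds the other.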

Cheers,
Longman
