Re: [PATCH v4 0/9] per lruvec lru_lock for memcg

From: Konstantin Khlebnikov
Date: Sun Nov 24 2019 - 10:49:47 EST


On 19/11/2019 15.23, Alex Shi wrote:
Hi all,

This patchset moves lru_lock into lruvec, giving each lruvec its own
lru_lock, and thus one lru_lock per memcg per node.

Following Daniel Jordan's suggestion, I ran 64 'dd' tasks in 32
containers on my 2-socket * 8-core * HT box with the modified case:
https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice

With this change, the lru_lock-sensitive test above improved by 17% in the
multiple-container scenario, with no performance loss w/o mem_cgroup.

Splitting lru_lock isn't the only option for solving this lock contention.
It also doesn't help if all of this happens in one cgroup.

I think better batching could solve more problems with less overhead.

Like larger per-cpu vectors, or queues for each NUMA node or even for each lruvec.
This would pre-sort and aggregate pages, so the actual modification under
lru_lock would be much cheaper and more fine-grained.
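
To illustrate the batching idea, here is a minimal user-space sketch, not kernel code. Pages are queued in a larger batch, sorted by their lruvec, and then drained so that each lruvec's lock is taken once per batch rather than once per page. All names here (struct page_batch, batch_add, batch_drain, BATCH_SIZE, the lock stand-ins) are hypothetical and chosen only for illustration; the lock is modelled as a counter so the number of acquisitions can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BATCH_SIZE 64	/* hypothetical batch size, for illustration only */

/* Simplified stand-ins for the kernel structures of the same name. */
struct lruvec {
	int id;
	unsigned long lock_acquisitions;	/* how often the lock was taken */
	unsigned long nr_pages;			/* pages moved onto this LRU */
};

struct page {
	struct lruvec *lruvec;
};

struct page_batch {
	size_t nr;
	struct page *pages[BATCH_SIZE];
};

/* Stand-ins for spin_lock()/spin_unlock(); they only count acquisitions. */
static void lruvec_lock(struct lruvec *v) { v->lock_acquisitions++; }
static void lruvec_unlock(struct lruvec *v) { (void)v; }

/* Order pages so that pages of the same lruvec become adjacent. */
static int cmp_by_lruvec(const void *a, const void *b)
{
	const struct page *pa = *(struct page *const *)a;
	const struct page *pb = *(struct page *const *)b;

	if (pa->lruvec->id < pb->lruvec->id)
		return -1;
	if (pa->lruvec->id > pb->lruvec->id)
		return 1;
	return 0;
}

/* Sort the batch, then switch locks only when the lruvec changes. */
static void batch_drain(struct page_batch *b)
{
	struct lruvec *locked = NULL;

	qsort(b->pages, b->nr, sizeof(b->pages[0]), cmp_by_lruvec);
	for (size_t i = 0; i < b->nr; i++) {
		struct lruvec *v = b->pages[i]->lruvec;

		if (v != locked) {
			if (locked)
				lruvec_unlock(locked);
			lruvec_lock(v);
			locked = v;
		}
		v->nr_pages++;	/* the "LRU modification" done under the lock */
	}
	if (locked)
		lruvec_unlock(locked);
	b->nr = 0;
}

/* Queue a page; drain automatically once the batch is full. */
static void batch_add(struct page_batch *b, struct page *p)
{
	assert(b->nr < BATCH_SIZE);
	b->pages[b->nr++] = p;
	if (b->nr == BATCH_SIZE)
		batch_drain(b);
}
```

In this sketch, a batch whose pages alternate between two lruvecs still takes each lock only once per drain, because the sort groups them first. Whether the sorting cost pays off in the kernel would depend on the batch size and the workload.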


Thanks to Hugh Dickins and Konstantin Khlebnikov, who both brought up the same idea
7 years ago. Now, considering my testing results and the fact that Google uses it
internally, I believe this feature clearly benefits multi-container users.

So I'd like to introduce it here.

Thanks for all the comments from Hugh Dickins, Konstantin Khlebnikov, Daniel Jordan,
Johannes Weiner, Mel Gorman, Shakeel Butt, Rong Chen, Fengguang Wu, Yun Wang etc.

v4:
a, fix the page->mem_cgroup dereferencing issue, thanks Johannes Weiner
b, remove the irqsave flags changes, thanks Matthew Wilcox
c, merge/split patches for better understanding and bisection purposes

v3: rebase on linux-next, and fold the relock fix patch into the introducing patch

v2: bypass a performance regression bug and fix some function issues

v1: initial version, aim testing shows a 5% performance increase


Alex Shi (9):
mm/swap: fix uninitialized compiler warning
mm/huge_memory: fix uninitialized compiler warning
mm/lru: replace pgdat lru_lock with lruvec lock
mm/mlock: only change the lru_lock iff page's lruvec is different
mm/swap: only change the lru_lock iff page's lruvec is different
mm/vmscan: only change the lru_lock iff page's lruvec is different
mm/pgdat: remove pgdat lru_lock
mm/lru: likely enhancement
mm/lru: revise the comments of lru_lock

Documentation/admin-guide/cgroup-v1/memcg_test.rst | 15 +----
Documentation/admin-guide/cgroup-v1/memory.rst | 6 +-
Documentation/trace/events-kmem.rst | 2 +-
Documentation/vm/unevictable-lru.rst | 22 +++----
include/linux/memcontrol.h | 68 ++++++++++++++++++++
include/linux/mm_types.h | 2 +-
include/linux/mmzone.h | 5 +-
mm/compaction.c | 67 +++++++++++++------
mm/filemap.c | 4 +-
mm/huge_memory.c | 17 ++---
mm/memcontrol.c | 75 +++++++++++++++++-----
mm/mlock.c | 27 ++++----
mm/mmzone.c | 1 +
mm/page_alloc.c | 1 -
mm/page_idle.c | 5 +-
mm/rmap.c | 2 +-
mm/swap.c | 74 +++++++++------------
mm/vmscan.c | 74 ++++++++++-----------
18 files changed, 287 insertions(+), 180 deletions(-)