We obviously want to keep gfs2_glock small; however, within reason we can add some additional fields as required. The use case is pretty much a standard LRU list: items are added and removed, mostly at the active end of the list, and the inactive end of the list is scanned periodically by gfs2_scan_glock_lru().
On 02/01/2018 10:54 AM, Steven Whitehouse wrote:
Hi,
On 31/01/18 23:04, daniel.m.jordan@xxxxxxxxxx wrote:
GFS2 has an lru list for glocks, which can be contended under certain workloads. Work is still ongoing to figure out exactly why, but this looks like it might be a good approach to that issue too. The main purpose of GFS2's lru list is to allow shrinking of the glocks under memory pressure via the gfs2_scan_glock_lru() function, and it looks like this type of approach could be used there to improve the scalability.
lru_lock, a per-node* spinlock that protects an LRU list, is one of the
hottest locks in the kernel. On some workloads on large machines, it
shows up at the top of lock_stat.
One way to improve lru_lock scalability is to introduce an array of locks,
with each lock protecting certain batches of LRU pages.
         *ooooooooooo**ooooooooooo**ooooooooooo**oooo ...
         |           ||           ||           ||
          \ batch 1 /  \ batch 2 /  \ batch 3 /
In this ASCII depiction of an LRU, a page is represented with either '*'
or 'o'. An asterisk indicates a sentinel page, which is a page at the
edge of a batch. An 'o' indicates a non-sentinel page.
To remove a non-sentinel LRU page, only one lock from the array is
required. This allows multiple threads to remove pages from different
batches simultaneously. A sentinel page requires lru_lock in addition to
a lock from the array.
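The locking rule above can be illustrated with a minimal userspace sketch. This is not the code from the patch series: the names (struct lru, batch_locks, lru_remove, NR_BATCHES) are hypothetical, pthread mutexes stand in for kernel spinlocks, and the lock order (lru_lock first, then the batch lock) is an assumption. How batch boundaries are repaired after a sentinel is removed is not described here and is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_BATCHES 3    /* hypothetical: fixed number of batches */

struct page {
    struct page *prev, *next;
    int batch;          /* index into batch_locks[] */
    bool sentinel;      /* true if the page sits at the edge of its batch */
};

struct lru {
    pthread_mutex_t lru_lock;                 /* list-wide lock */
    pthread_mutex_t batch_locks[NR_BATCHES];  /* one lock per batch */
    struct page head;                         /* circular list head */
};

static void lru_init(struct lru *l)
{
    pthread_mutex_init(&l->lru_lock, NULL);
    for (int i = 0; i < NR_BATCHES; i++)
        pthread_mutex_init(&l->batch_locks[i], NULL);
    l->head.prev = l->head.next = &l->head;
    l->head.batch = -1;
    l->head.sentinel = false;
}

/* Setup helper for single-threaded use: append at the tail, no locking. */
static void lru_add_tail(struct lru *l, struct page *p, int batch, bool sentinel)
{
    p->batch = batch;
    p->sentinel = sentinel;
    p->prev = l->head.prev;
    p->next = &l->head;
    l->head.prev->next = p;
    l->head.prev = p;
}

/*
 * Remove a page from the list. A non-sentinel page needs only its batch
 * lock; a sentinel page also needs lru_lock. Returns the number of locks
 * taken (1 or 2). Assumed lock order: lru_lock, then the batch lock.
 */
static int lru_remove(struct lru *l, struct page *p)
{
    bool sentinel = p->sentinel;
    int locks_taken = 0;

    assert(p->batch >= 0 && p->batch < NR_BATCHES);

    if (sentinel) {
        pthread_mutex_lock(&l->lru_lock);
        locks_taken++;
    }
    pthread_mutex_lock(&l->batch_locks[p->batch]);
    locks_taken++;

    /* Unlink the page from the doubly linked list. */
    p->prev->next = p->next;
    p->next->prev = p->prev;
    p->prev = p->next = NULL;

    pthread_mutex_unlock(&l->batch_locks[p->batch]);
    if (sentinel)
        pthread_mutex_unlock(&l->lru_lock);

    return locks_taken;
}
```

In this sketch, two threads calling lru_remove() on non-sentinel pages in different batches never contend on the same mutex, which is the property the scheme relies on.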
Full performance numbers appear in the last patch in this series, but this
prototype allows a microbenchmark to do up to 28% more page faults per
second with 16 or more concurrent processes.
This work was developed in collaboration with Steve Sistare.
Note: This is an early prototype. I'm submitting it now to support my
request to attend LSF/MM, as well as get early feedback on the idea. Any
comments appreciated.
* lru_lock is actually per-memcg, but without memcg's in the picture it
  becomes per-node.
Glad to hear that this could help in gfs2 as well.
Hopefully struct gfs2_glock is less space constrained than struct page for storing the few bits of metadata that this approach requires.
Daniel