On Thu, Nov 14, 2024 at 04:39:00PM -0500, Sid Kumar wrote:Hello,
[...]
On 11/14/24 12:05 PM, Sidhartha Kumar wrote:
================ results =========================
Bpftrace was used to profile the allocation path for requesting new maple
nodes while running the ./mmap1_processes test from mmtests. The two paths
for allocation are requests for a single node and the bulk allocation path.
The histogram represents the number of calls to these paths and a shows the
distribution of the number of nodes requested for the bulk allocation path.
mm-unstable 11/13/24
@bulk_alloc_req:
[2, 4) 10 |@@@@@@@@@@@@@ |
[4, 8) 38 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[8, 16) 19 |@@@@@@@@@@@@@@@@@@@@@@@@@@ |
mm-unstable 11/13/24 + this series
@bulk_alloc_req:
[2, 4) 9 |@@@@@@@@@@ |
[4, 8) 43 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[8, 16) 15 |@@@@@@@@@@@@@@@@@@ |
We can see the worst case bulk allocations of [8,16) nodes are reduced after
this series.
From running the ./malloc1_threads test case we eliminate almost all bulk
allocation requests that
fall between 8 and 16 nodes
./malloc1_threads -t 8 -s 100
mm-unstable + this series
@bulk_alloc_req:
[2, 4) 2 |
|
[4, 8) 3381
|@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[8, 16) 2 |
|
This is impressive. But I come up one thing not clear.
For mmap related code, we usually have the following usage:
vma_iter_prealloc(vmi, vma);
mas_preallocate(vmi->mas, vma);
MA_WR_STATE(wr_mas, );
mas_wr_store_type(&wr_mas); --- (1)
vma_iter_store(vmi, vma);
Locaton (1) is where we try to get a better estimation of allocations.
The estimation is based on we walk down the tree and try to meet a proper
node.
In mmap related code, we usually have already walked down the
tree to leaf, by vma_find() or related iteration operation, and the mas.status
is set to ma_active. To me, I don't expect mas_preallocate() would traverse
the tree again.
But from your result, it seems most cases do traverse the tree again to get a
more precise height.
Which part do you think I have missed?
mm-unstable
@bulk_alloc_req:
[2, 4) 1 |
|
[4, 8) 1427 |@@@@@@@@@@@@@@@@@@@@@@@@@@
|
[8, 16) 2790
|@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
Sidhartha Kumar (5):
maple_tree: convert mas_prealloc_calc() to take in a maple write state
maple_tree: use height and depth consistently
maple_tree: use vacant nodes to reduce worst case allocations
maple_tree: break on convergence in mas_spanning_rebalance()
maple_tree: add sufficient height
include/linux/maple_tree.h | 4 +
lib/maple_tree.c | 89 +++++++++++++---------
tools/testing/radix-tree/maple.c | 125 +++++++++++++++++++++++++++++--
3 files changed, 176 insertions(+), 42 deletions(-)