Re: [PATCH v7 00/70] Introducing the Maple Tree
From: Liam Howlett
Date: Thu Apr 14 2022 - 13:02:16 EST
* Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> [220414 09:57]:
> * Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> [220414 02:51]:
> > On Mon, 4 Apr 2022 14:35:26 +0000 Liam Howlett <liam.howlett@xxxxxxxxxx> wrote:
> >
> > > Please add this patch set to your branch. They are based on v5.18-rc1.
> >
> > Do we get a nice [0/n] cover letter telling us all about this?
Here is a cover letter you can use:
The maple tree is an RCU-safe range based B-tree designed to use modern
processor cache efficiently. There are a number of places in the kernel
that a non-overlapping range-based tree would be beneficial, especially
one with a simple interface. The first user that is covered in this
patch set is the vm_area_struct, where three data structures are
replaced by the maple tree: the augmented rbtree, the vma cache, and the
linked list of VMAs in the mm_struct. The long term goal is to reduce
or remove the mmap_sem contention.
The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf
nodes. With the increased branching factor, it is significantly shorter than
the rbtree so it has fewer cache misses. The removal of the linked list
between subsequent entries also reduces the cache misses and the need to pull
in the previous and next VMA during many tree alterations.
This patch is based on v5.18-rc1
git: https://github.com/oracle/linux-uek/tree/howlett/maple/20220404
v7 changes:
- kernel/fork: Remove unused prev from dup_mmap()
- mm/mmap: Fix unmapped_area_topdown() and expand_downwards()
- mm/mmap: Make vma_store() static
- mm/mmap: Revert to using mmap_write_lock() on exit
- fs/coredump: Fix coredump_next_vma when there is no gate_vma on first
call
- mm/mempolicy: Port Hugh's mempolicy fix
- mm/nommu: Add find_vma_intersection to nommu
- mm/damon: Fix stack size in damon_test_three_regions_in_vmas()
- maple_tree: Fix spanning store detection on ULONG_MAX
- maple_tree: Fix potential metadata off by one in mas_mab_cp() on
final node
- maple_tree: Fix left max on rebalance
- maple_tree: Fix mas_reuse_node() zeroing
- maple_tree: Fix extend_null overflow
- maple_tree: Fix spanning store mast->r->max
- maple_tree: Fix node_size check for slow path in mas_wr_modify()
- maple_tree: Fix spanning store on two node rebalance
- maple_tree: Fix spanning store to a single root node
- maple_tree: Better protect mt_for_each() arguments
- maple_tree: maple_tree: Fix stack size
- maple_tree: Fix mas_dead_node() to work with BE and LE arch
v6: https://lore.kernel.org/linux-mm/20220215143728.3810954-1-Liam.Howlett@xxxxxxxxxx/
v5: https://lore.kernel.org/linux-mm/20220202024137.2516438-1-Liam.Howlett@xxxxxxxxxx/
v4: https://lore.kernel.org/linux-mm/20211201142918.921493-1-Liam.Howlett@xxxxxxxxxx/
v3: https://lore.kernel.org/linux-mm/20211005012959.1110504-1-Liam.Howlett@xxxxxxxxxx/
v2: https://lore.kernel.org/linux-mm/20210817154651.1570984-1-Liam.Howlett@xxxxxxxxxx/
v1: https://lore.kernel.org/linux-mm/20210428153542.2814175-1-Liam.Howlett@xxxxxxxxxx/
> >
> > I have that all merged up and it compiles.
> >
> > https://lkml.kernel.org/r/20220402094550.129-1-lipeifeng@xxxxxxxx and
> > https://lkml.kernel.org/r/20220412081014.399-1-lipeifeng@xxxxxxxx are
> > disabled for now.
> >
> >
> > Several patches were marked
> >
> > From: Liam
> > Signed-off-by: Matthew
> > Signed-off-by: Liam
> >
> > Which makes me wonder whether the attribution was correct. Please
> > double-check.
>
> I'll have a look, thanks.
>
> >
> >
> >
> > I made a lame attempt to fix up mglru's get_next_vma(), and it probably
> > wants a revisit in the maple-tree world anyway. Please check this and
> > send me a better version ;)
>
> What you have below will function, but there is probably a more maple
> way of doing things. Happy to help get the sap flowing - it is that
> time of the year after all ;)
>
> >
> > --- a/mm/vmscan.c~mglru-vs-maple-tree
> > +++ a/mm/vmscan.c
> > @@ -3704,7 +3704,7 @@ static bool get_next_vma(struct mm_walk
> >
> > while (walk->vma) {
> > if (next >= walk->vma->vm_end) {
> > - walk->vma = walk->vma->vm_next;
> > + walk->vma = find_vma(walk->mm, walk->vma->vm_end);
> > continue;
> > }
> >
> > @@ -3712,7 +3712,7 @@ static bool get_next_vma(struct mm_walk
> > return false;
> >
> > if (should_skip_vma(walk->vma->vm_start, walk->vma->vm_end, walk)) {
> > - walk->vma = walk->vma->vm_next;
> > + walk->vma = find_vma(walk->mm, walk->vma->vm_end);
> > continue;
> > }
> >
> > @@ -4062,7 +4062,7 @@ static void walk_mm(struct lruvec *lruve
> > /* the caller might be holding the lock for write */
> > if (mmap_read_trylock(mm)) {
> > unsigned long start = walk->next_addr;
> > - unsigned long end = mm->highest_vm_end;
> > + unsigned long end = TASK_SIZE;
> >
> > err = walk_page_range(mm, start, end, &mm_walk_ops, walk);
> >
> >
> >
> > I flung a tree up at
> > git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mmtemp containing
> > mglru and mapletree and a few other things. Could the mglru and
> > mapletree people please runtime test it, send any fixes?
>
> Sure thing.
>
> Thanks,
> Liam