[PATCH -next v3 0/9] rbtree: Cache leftmost node internally

From: Davidlohr Bueso
Date: Thu Jun 29 2017 - 13:18:41 EST


Changes from v2 (https://lkml.org/lkml/2017/6/8/857):
- Fixed 0day reported crash for drm_mm selftest program. We were
not correctly using the cached version of rbtree with the allocated
nodes.
- Added cfq patch to use internal rbtree caching.
- Added Christian's and Jan's reviews.

Changes from v1 (https://marc.info/?l=linux-kernel&m=149611025616685):
- No longer rfc.
- Removed bogus semimcolon in rb_first_cached()
- Updated missing interval tree user drivers/infiniband/hw/hfi1/
- Removed redundant @cached arg in when erasing a node.
- Added more patches that make use of rb_first_cached(), which I
thought might be worth it: procfs and epoll.
- Cc more people for patch 5, which touches drivers such as infiniband
and gpu. The rest of the changes are pretty covered with the current
Cc'ed maintainers and mm folks.

Hi,

Here's a proposal for extending rbtrees to internally cache the leftmost
node such that we can have fast overlap check optimization for all interval
tree users[1]. The benefits of this series are that:

(i) Unify users that do internal leftmost node caching.
(ii) Optimize all interval tree users.
(iii) Convert at least two new users (epoll and procfs) to the new interface.

Patch 1: Layout the rb machinery.

Patches 2-5: Make use of the internal leftmost node in scheduler and
rt mutexes and cfq.

Patch 6: Implements fast overlap checks for interval trees.

Patch 7: rocket science.

Patches 8,9: New patches that convert to O(1) rb_first_cached().

The series has survived booting, kernel builds and pistress workloads.

Ingo, I know it's late in the game, but could it be considered for
v4.13? Given that v2 has been there a while and there are no issues
currently. Applies on top of today's -next.

Thanks!

Davidlohr Bueso (9):
rbtree: Cache leftmost node internally
sched/fair: Replace cfs_rq->rb_leftmost
sched/deadline: Replace earliest dl and rq leftmost caching
locking/rtmutex: Replace top-waiter and pi_waiters leftmost caching
block/cfq: Replace cfq_rb_root leftmost caching
lib/interval_tree: Fast overlap detection
lib/interval-tree: Correct comment wrt generic flavor
procfs: Use faster rb_first_cached()
fs/epoll: Use faster rb_first_cached()

block/cfq-iosched.c | 70 +++++++---------------
drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 8 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +-
drivers/gpu/drm/drm_mm.c | 19 +++---
drivers/gpu/drm/drm_vma_manager.c | 2 +-
drivers/gpu/drm/i915/i915_gem_userptr.c | 6 +-
drivers/gpu/drm/radeon/radeon.h | 2 +-
drivers/gpu/drm/radeon/radeon_mn.c | 8 +--
drivers/gpu/drm/radeon/radeon_vm.c | 7 ++-
drivers/infiniband/core/umem_rbtree.c | 4 +-
drivers/infiniband/core/uverbs_cmd.c | 2 +-
drivers/infiniband/hw/hfi1/mmu_rb.c | 10 ++--
drivers/infiniband/hw/usnic/usnic_uiom.c | 6 +-
drivers/infiniband/hw/usnic/usnic_uiom.h | 2 +-
.../infiniband/hw/usnic/usnic_uiom_interval_tree.c | 15 ++---
.../infiniband/hw/usnic/usnic_uiom_interval_tree.h | 12 ++--
drivers/vhost/vhost.c | 2 +-
drivers/vhost/vhost.h | 2 +-
fs/eventpoll.c | 30 +++++-----
fs/hugetlbfs/inode.c | 6 +-
fs/inode.c | 2 +-
fs/proc/generic.c | 26 ++++----
fs/proc/internal.h | 2 +-
fs/proc/proc_net.c | 2 +-
fs/proc/root.c | 2 +-
include/drm/drm_mm.h | 2 +-
include/linux/fs.h | 4 +-
include/linux/init_task.h | 5 +-
include/linux/interval_tree.h | 8 ++-
include/linux/interval_tree_generic.h | 48 +++++++++++----
include/linux/mm.h | 17 +++---
include/linux/rbtree.h | 11 ++++
include/linux/rbtree_augmented.h | 33 +++++++++-
include/linux/rmap.h | 4 +-
include/linux/rtmutex.h | 11 ++--
include/linux/sched.h | 3 +-
include/rdma/ib_umem_odp.h | 11 ++--
include/rdma/ib_verbs.h | 2 +-
kernel/fork.c | 3 +-
kernel/locking/rtmutex-debug.c | 2 +-
kernel/locking/rtmutex.c | 35 ++++-------
kernel/locking/rtmutex_common.h | 12 ++--
kernel/sched/deadline.c | 50 ++++++----------
kernel/sched/debug.c | 2 +-
kernel/sched/fair.c | 35 ++++-------
kernel/sched/sched.h | 9 +--
lib/interval_tree_test.c | 4 +-
lib/rbtree.c | 34 +++++++++--
mm/interval_tree.c | 10 ++--
mm/memory.c | 4 +-
mm/mmap.c | 10 ++--
mm/rmap.c | 4 +-
53 files changed, 330 insertions(+), 299 deletions(-)

--
2.12.0