Re: [PATCH v4 1/2] rcu/tree: Add basic support for kfree_rcu() batching

From: Uladzislau Rezki
Date: Tue Oct 08 2019 - 12:23:19 EST

On Fri, Oct 04, 2019 at 01:20:38PM -0400, Joel Fernandes wrote:
> On Tue, Oct 01, 2019 at 01:27:02PM +0200, Uladzislau Rezki wrote:
> [snip]
> > > > I have just a small question related to workloads and performance evaluation.
> > > > Are you aware of any specific workloads which benefit from it, for example
> > > > in the mobile area? I am asking because I am thinking about backporting it
> > > > and reusing it on our kernel.
> > >
> > > I am not aware of a mobile usecase that benefits, but there are server
> > > workloads where it makes the system more stable in the face of a
> > > kfree_rcu() flood.
> > >
> > OK, I got it. I wanted to test it to find out how it could affect mobile
> > workloads.
> >
> > >
> > > For the KVA allocator work, I see it is quite similar to the way binder
> > > allocates blocks. See function: binder_alloc_new_buf_locked(). Is there
> > > any chance to reuse any code? For one thing, binder also has an rbtree for
> > > allocated blocks for fast lookup of allocated blocks. Does the KVA allocator
> > > not have the need for that?
> > >
> > Well, there is a difference. Actually the free blocks are not sorted by
> > their size like in the binder layer, if I understand the code correctly.
> >
> > Instead, I keep them (the free blocks) sorted by start address in ascending
> > order and maintain an augment value (the biggest free size in the left or
> > right sub-tree) for each node. That allows navigating toward the lowest
> > address and a block that definitely fits. As a result our allocations
> > become sequential, which is important.
> Right, I realized this after sending the email that binder and kva sort
> differently though they both try to use free sizes during the allocation.
> Would you have any papers, which survey various rb-tree based allocator
> algorithms and their tradeoffs? I am interested in studying these more
> especially in relation to the binder driver. Would also be nice to make
> contributions to papers surveying both these allocators to describe the state
> of the art.
So far I have not come across any paper with that kind of comparison. But
it would be interesting for sure, especially to analyze a model based on,
for example, a B-tree, so that we can make full use of the CPU cache,
because regular binary trees are just pointer chasing.
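
To illustrate the lowest-address search over the address-sorted, augmented
tree quoted above, here is a minimal user-space sketch. It is only an
assumption of how the idea can look: it uses a plain unbalanced binary search
tree instead of the kernel rb-tree, it ignores alignment and range limits, and
the names (struct free_block, fb_insert, fb_find_lowest) are hypothetical and
not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-block node, keyed by start address. */
struct free_block {
	unsigned long start;            /* key: start address */
	unsigned long size;             /* size of this free block */
	unsigned long subtree_max_size; /* augment: biggest size in this sub-tree */
	struct free_block *left, *right;
};

static unsigned long subtree_max(const struct free_block *n)
{
	return n ? n->subtree_max_size : 0;
}

static unsigned long max3(unsigned long a, unsigned long b, unsigned long c)
{
	unsigned long m = a > b ? a : b;

	return m > c ? m : c;
}

/*
 * Insert by start address (unbalanced, for brevity) and refresh the
 * augment value of every node on the path back to the root.
 */
struct free_block *fb_insert(struct free_block *root, struct free_block *n)
{
	if (!root) {
		assert(n->size > 0);
		n->left = n->right = NULL;
		n->subtree_max_size = n->size;
		return n;
	}

	if (n->start < root->start)
		root->left = fb_insert(root->left, n);
	else
		root->right = fb_insert(root->right, n);

	root->subtree_max_size = max3(root->size, subtree_max(root->left),
				      subtree_max(root->right));
	return root;
}

/* Return the lowest-address block with size >= want, or NULL if none fits. */
struct free_block *fb_find_lowest(struct free_block *n, unsigned long want)
{
	while (n && n->subtree_max_size >= want) {
		if (subtree_max(n->left) >= want)
			n = n->left;    /* a fit exists at a lower address */
		else if (n->size >= want)
			return n;       /* nothing lower fits, this node does */
		else
			n = n->right;   /* the fit must be in the right sub-tree */
	}
	return NULL;
}
```

Each step either descends one level or returns, so the search is bounded by
the tree height. With a balanced tree that is O(log N), and the augment value
guarantees we never walk into a sub-tree that cannot satisfy the request.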

As for the binder driver and its allocator, is it O(log N) complexity? Is
there any bottleneck in its implementation?


Vlad Rezki