Re: [PATCH 09/24] rcu/tree: cache specified number of objects
From: Uladzislau Rezki
Date: Mon May 04 2020 - 13:48:30 EST
On Mon, May 04, 2020 at 08:24:37AM -0700, Paul E. McKenney wrote:
> On Mon, May 04, 2020 at 02:43:23PM +0200, Uladzislau Rezki wrote:
> > On Fri, May 01, 2020 at 02:27:49PM -0700, Paul E. McKenney wrote:
> > > On Tue, Apr 28, 2020 at 10:58:48PM +0200, Uladzislau Rezki (Sony) wrote:
> > > > Cache some extra objects per CPU. During the reclaim process,
> > > > some pages are cached instead of being released, by linking
> > > > them into a list. This approach provides O(1) access time to
> > > > the cache.
> > > >
> > > > That reduces the number of requests to the page allocator and
> > > > also makes the cache more helpful if a low-memory condition occurs.
> > > >
> > > > A parameter reflecting the minimum number of pages to be
> > > > cached per CPU is exposed via sysfs. It is read-only, and
> > > > its name is "rcu_min_cached_objs".
> > > >
> > > > Signed-off-by: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>
> > > > ---
> > > > kernel/rcu/tree.c | 64 ++++++++++++++++++++++++++++++++++++++++++++---
> > > > 1 file changed, 60 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > index 89e9ca3f4e3e..d8975819b1c9 100644
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -178,6 +178,14 @@ module_param(gp_init_delay, int, 0444);
> > > > static int gp_cleanup_delay;
> > > > module_param(gp_cleanup_delay, int, 0444);
> > > >
> > > > +/*
> > > > + * This rcu parameter is read-only, but can be write also.
> > >
> > > You mean that although the parameter is read-only, you see no reason
> > > why it could not be converted to writeable?
> > >
> > I just added a note. If it were writable, we could change the size of the
> > per-CPU cache dynamically, i.e. "echo 5 > /sys/.../rcu_min_cached_objs"
> > would cache 5 pages. But I do not have a strong opinion on whether it
> > should be writable.
> >
> > > If it was writeable, and a given CPU had the maximum number of cached
> > > objects, the rcu_min_cached_objs value was decreased, but that CPU never
> > > saw another kfree_rcu(), would the number of cached objects change?
> > >
> > No. It works this way: kfree_rcu() dequeues a page from the cache,
> > whereas the "rcu work" puts it back if the number of cached objects is
> > < rcu_min_cached_objs, and frees the page if it is >=.
>
> Just to make sure I understand... If someone writes a smaller number to
> the sysfs variable, the per-CPU caches will be decreased at that point,
> immediately during that sysfs write? Or are you saying something else?
>
This patch defines it as read-only. It defines the minimum threshold that
controls the number of elements in the per-CPU cache. If we decide to make it
writable as well, then we will have full freedom in how to define its behavior,
i.e. that behavior is not defined now because the parameter is read-only.
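
To illustrate the threshold behavior discussed above, here is a minimal
userspace sketch. It is not the code from the patch: the struct and function
names (page_cache, cache_get, cache_put_or_free) are hypothetical, locking is
omitted, and min_cached only models rcu_min_cached_objs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names come from the patch. */
struct cached_page {
	struct cached_page *next;
};

struct page_cache {
	struct cached_page *head;	/* singly linked list of cached pages */
	int nr_cached;			/* pages currently in the cache */
	int min_cached;			/* models rcu_min_cached_objs */
};

/* Models the kfree_rcu() side: take a page from the cache, NULL if empty. */
static struct cached_page *cache_get(struct page_cache *c)
{
	struct cached_page *p = c->head;

	if (!p)
		return NULL;

	c->head = p->next;
	c->nr_cached--;
	p->next = NULL;
	return p;
}

/*
 * Models the "rcu work" side: put the page back while the cache is below
 * the threshold, otherwise free it. Returns 1 if cached, 0 if freed.
 */
static int cache_put_or_free(struct page_cache *c, struct cached_page *p)
{
	assert(p != NULL);

	if (c->nr_cached < c->min_cached) {
		p->next = c->head;
		c->head = p;
		c->nr_cached++;
		return 1;
	}

	free(p);
	return 0;
}
```

In this model, lowering min_cached does nothing by itself: the cache only
shrinks when a page is taken out by cache_get() and later handed to
cache_put_or_free(), which then frees it instead of caching it.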
> > > Presumably the list can also be accessed without holding this lock,
> > > because otherwise we shouldn't need llist...
> > >
> > Hm... We increase the number of elements in the cache, therefore it is not
> > lockless. On the other hand, I used llist_head to maintain the cache
> > because it is a singly linked list and we do not need the "*prev" link.
> > Also, we do not need to initialize the list.
> >
> > But I can change it to list_head. Please let me know if I should :)
>
> Hmmm... Maybe it is time for a non-atomic singly linked list? In the RCU
> callback processing, the operations were open-coded, but they have been
> pushed into include/linux/rcu_segcblist.h and kernel/rcu/rcu_segcblist.*.
>
> Maybe some non-atomic/protected/whatever macros in the llist.h file?
> Or maybe just open-code the singly linked list? (Probably not the
> best choice, though.) Add comments stating that the atomic properties
> of the llist functions aren't needed? Something else?
>
To keep it simple, can I replace llist_head with list_head?
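
For reference, a non-atomic singly linked list of the kind suggested above
could look roughly like the sketch below. This is only a userspace
illustration under stated assumptions: the names (snode, slist, slist_push,
slist_pop) are hypothetical and are not helpers from llist.h, and the caller
is assumed to hold whatever lock protects the cache.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; these are not existing kernel helpers. */
struct snode {
	struct snode *next;
};

struct slist {
	struct snode *first;	/* NULL means empty, so a zeroed head is valid */
};

/* Push to the front. Plain stores only: the caller must hold the lock. */
static void slist_push(struct slist *l, struct snode *n)
{
	assert(n != NULL);

	n->next = l->first;
	l->first = n;
}

/* Pop from the front, or return NULL if the list is empty. */
static struct snode *slist_pop(struct slist *l)
{
	struct snode *n = l->first;

	if (n) {
		l->first = n->next;
		n->next = NULL;
	}
	return n;
}
```

Both operations are O(1) and need no "*prev" link, and a zero-initialized
head needs no separate init step. Without atomics, correctness depends
entirely on the external lock.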
>
> The comments would be a good start. Just to take pity on people seeing
> the potential for concurrency and wondering how the concurrent accesses
> actually happen. ;-)
>
Sounds like you are kidding me :)
--
Vlad Rezki