Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()

From: Joel Fernandes
Date: Mon Oct 31 2022 - 14:15:52 EST


On Mon, Oct 31, 2022 at 9:21 AM Uladzislau Rezki <urezki@xxxxxxxxx> wrote:
>
> On Fri, Oct 28, 2022 at 09:23:47PM +0000, Joel Fernandes wrote:
> > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > >
> > > > > You guys might need to agree on the definition of "good" here. Or maybe
> > > > > understand the differences in your respective platforms' definitions of
> > > > > "good". ;-)
> > > > >
> > > > Indeed. Bad is when it happens once per millisecond, indefinitely :)
> > > > At least in such a workload I can detect a power delta and a power gain.
> > > > Anyway, below is a new trace where I do not use the "flush" variant for
> > > > kvfree_rcu():
> > > >
> > > > <snip>
> > > > 1. Home screen swipe:
[...]
> > > > 2. App launches:
[...]
> > > > <snip>
> > > >
> > > > It is much better. But as I wrote earlier, there is a patch that I
> > > > submitted some time ago that improves kvfree_rcu() batching:
> > > >
> > > > <snip>
> > > > commit 51824b780b719c53113dc39e027fbf670dc66028
> > > > Author: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>
> > > > Date: Thu Jun 30 18:33:35 2022 +0200
> > > >
> > > > rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval
> > > >
> > > > Currently the monitor work is scheduled with a fixed interval of HZ/20,
> > > > which is roughly 50 milliseconds. The drawback of this approach is
> > > > low utilization of the 512 page slots in scenarios with infrequent
> > > > kvfree_rcu() calls. For example on an Android system:
> > > > <snip>
> > > >
> > > > The trace that i posted was taken without it.
> > >
> > > And if I am not getting too confused, that patch is now in mainline.
> > > So it does make sense to rely on it, then. ;-)
> >
> > Vlad's patch to change the KFREE_DRAIN_JIFFIES to 5 seconds seems reasonable
> > to me. However, can we unify KFREE_DRAIN_JIFFIES and LAZY_FLUSH_JIFFIES ?
> >
> This is very good.
>
> Below is a plot that I took during one use case, with three apps used
> in parallel. It was produced by running the "monkey" test:
>
> wget ftp://vps418301.ovh.net/incoming/monkey_3_apps_slab_usage_5_minutes.png
>
> I set up three apps as the usage scenario: Google Chrome, YouTube and Camera.
> I logged the Slab metric from the /proc/meminfo. Sampling rate is 0.1 second.
>
> Please have a look at the results. They reflect what I am saying: the
> non-flush kvfree RCU variant makes memory usage higher, which is not
> acceptable for our mobile devices and workloads.
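
For anyone wanting to reproduce the measurement described above (logging
the Slab field of /proc/meminfo every 0.1 seconds), a minimal sketch
could look like the following. This is an assumption about the method,
not the script that produced the plot; the function names
parse_slab_kb() and sample_slab() are hypothetical.

```python
import re
import time


def parse_slab_kb(meminfo_text):
    """Return the Slab value in kB from /proc/meminfo-style text, or None."""
    # Anchor at line start so SReclaimable/SUnreclaim are not matched.
    match = re.search(r"^Slab:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_slab(duration_s, interval_s=0.1, path="/proc/meminfo"):
    """Collect (elapsed_seconds, slab_kb) samples for duration_s seconds."""
    samples = []
    start = time.monotonic()
    while True:
        elapsed = time.monotonic() - start
        if elapsed >= duration_s:
            break
        with open(path) as f:
            samples.append((elapsed, parse_slab_kb(f.read())))
        # Assumed sampling period of 0.1 s, matching the rate quoted above.
        time.sleep(interval_s)
    return samples
```

Calling sample_slab(300) would cover a five-minute run; the samples can
then be plotted for the flush and non-flush kernels side by side.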

That does look higher, though honestly only by about 5%. But that's just
the effect of more "laziness". The graph itself does not show a higher
number of shrinker invocations. In fact, I think shrinker invocations
are not happening much, which is why the slab holds more memory. The
system may not be under memory pressure?

Anyway, I agree with your point of view and I think my concern does
not even occur with the latest patch on avoiding RCU that I posted
[1], so I come in peace.

[1] https://lore.kernel.org/rcu/20221029132856.3752018-1-joel@xxxxxxxxxxxxxxxxx/

I am going to start merging all the lazy patches to ChromeOS 5.10 now,
including your kfree updates, except for [1] while we discuss it.