Re: Severe performance regression w/ 4.4+ on Android due to cgroup locking changes

From: Paul E. McKenney
Date: Wed Jul 13 2016 - 17:42:40 EST


On Wed, Jul 13, 2016 at 02:18:41PM -0700, Paul E. McKenney wrote:
> On Wed, Jul 13, 2016 at 05:05:26PM -0400, Tejun Heo wrote:
> > On Wed, Jul 13, 2016 at 02:03:15PM -0700, Paul E. McKenney wrote:
> > > Take the patch that I just sent out and make the choice of normal
> > > vs. expedited depend on CONFIG_PREEMPT_RT or whatever the -rt guys are
> > > calling it these days. Is there a low-latency Kconfig option other
> > > than CONFIG_NO_HZ_FULL?
> >
> > Sounds like a plan to me.
>
> I like the way we like each other's idea. Mutually assured laziness? ;-)

But here is what mine might look like. Untested, probably does
not even build. Note that the default is -no- expediting; use the
rcusync.expedited kernel parameter to enable it.
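
The idea, as a userspace sketch rather than kernel code: keep the
grace-period ops table writable and have an init hook swap the .sync
slots when the flag is set. Every demo_* name below is a hypothetical
stand-in invented for illustration; none of them exist in the kernel,
and the real primitives are replaced by stubs that only record which
variant ran.

```c
#include <stdbool.h>

/* Records which stub ran last; the stubs stand in for real grace periods. */
static const char *demo_last_sync;

static void demo_sync_normal(void)    { demo_last_sync = "normal"; }
static void demo_sync_expedited(void) { demo_last_sync = "expedited"; }

enum demo_type { DEMO_RCU_SYNC, DEMO_SCHED_SYNC, DEMO_NR_TYPES };

/* Deliberately not const: the init hook below rewrites the .sync slots. */
static struct {
	void (*sync)(void);
} demo_gp_ops[DEMO_NR_TYPES] = {
	[DEMO_RCU_SYNC]   = { .sync = demo_sync_normal },
	[DEMO_SCHED_SYNC] = { .sync = demo_sync_normal },
};

/* Models the boot parameter; the default is no expediting. */
static bool demo_expedited;

/* Models the early initcall: runs once, after the flag has been parsed. */
static void demo_early_init(void)
{
	int i;

	if (!demo_expedited)
		return;
	for (i = 0; i < DEMO_NR_TYPES; i++)
		demo_gp_ops[i].sync = demo_sync_expedited;
}

/* Invoke the selected primitive and report which variant ran. */
static const char *demo_run_sync(enum demo_type t)
{
	demo_gp_ops[t].sync();
	return demo_last_sync;
}
```

With demo_expedited left false, demo_run_sync() reports "normal";
setting it before demo_early_init() switches every slot to the
expedited stub. On a real system the flag would presumably be set by
booting with rcusync.expedited=1, assuming the patch below builds.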

Thanx, Paul

------------------------------------------------------------------------

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 82b42c958d1c..b8bc9854e548 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3229,6 +3229,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			energy efficiency by requiring that the kthreads
 			periodically wake up to do the polling.

+	rcusync.expedited	[KNL]
+			Specify that the rcusync mechanism use expedited
+			grace periods. As of mid-2016, this affects
+			per-CPU rwsems.
+
 	rcutree.blimit=	[KNL]
 			Set maximum number of finished RCU callbacks to
 			process in one batch.
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index be922c9f3d37..5bc5bef2e00a 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -22,6 +22,14 @@

 #include <linux/rcu_sync.h>
 #include <linux/sched.h>
+#include <linux/moduleparam.h>
+#include <linux/module.h>
+
+MODULE_ALIAS("rcusync");
+#ifdef MODULE_PARAM_PREFIX
+#undef MODULE_PARAM_PREFIX
+#endif
+#define MODULE_PARAM_PREFIX "rcusync."

 #ifdef CONFIG_PROVE_RCU
 #define __INIT_HELD(func) .held = func,
@@ -29,7 +37,7 @@
 #define __INIT_HELD(func)
 #endif

-static const struct {
+static struct {
 	void (*sync)(void);
 	void (*call)(struct rcu_head *, void (*)(struct rcu_head *));
 	void (*wait)(void);
@@ -62,6 +70,20 @@ enum { CB_IDLE = 0, CB_PENDING, CB_REPLAY };

 #define rss_lock gp_wait.lock

+static bool expedited;
+module_param(expedited, bool, 0444);
+
+static int __init rcu_sync_early_init(void)
+{
+	if (expedited) {
+		gp_ops[RCU_SYNC].sync = synchronize_rcu_expedited;
+		gp_ops[RCU_SCHED_SYNC].sync = synchronize_sched_expedited;
+		gp_ops[RCU_BH_SYNC].sync = synchronize_rcu_bh_expedited;
+	}
+	return 0;
+}
+early_initcall(rcu_sync_early_init);
+
 #ifdef CONFIG_PROVE_RCU
 void rcu_sync_lockdep_assert(struct rcu_sync *rsp)
 {