Re: Cgroup memory barrier usage and call frequency from scheduler

From: Peter Zijlstra
Date: Thu Apr 09 2020 - 12:49:32 EST


On Thu, Apr 09, 2020 at 04:44:13PM +0100, Mel Gorman wrote:

> For 1, the use of a full barrier seems unnecessary when it appears that
> you could have used a read barrier and a write barrier. The following
> patch drops the profile overhead to 0.1%

Yikes. And why still 0.1%? The below should be a barrier() on x86. Is the
compiler so constrained by that?

> diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
> index ca19b4c8acf5..bc3125949b4b 100644
> --- a/kernel/cgroup/rstat.c
> +++ b/kernel/cgroup/rstat.c
> @@ -36,7 +36,7 @@ void cgroup_rstat_updated(struct cgroup *cgrp, int cpu)
> * Paired with the one in cgroup_rstat_cpu_pop_upated(). Either we
> * see NULL updated_next or they see our updated stat.
> */
> - smp_mb();
> + smp_rmb();
>
> /*
> * Because @parent's updated_children is terminated with @parent
> @@ -139,7 +139,7 @@ static struct cgroup *cgroup_rstat_cpu_pop_updated(struct cgroup *pos,
> * Either they see NULL updated_next or we see their
> * updated stat.
> */
> - smp_mb();
> + smp_wmb();
>
> return pos;
> }
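
To make the barrier() remark concrete, here is a minimal userspace sketch,
not kernel code. compiler_barrier() and full_fence() are hypothetical
stand-ins for barrier() and smp_mb(), and stat_value and on_list are made-up
variables standing in for the updated stat and the updated_next check. It
assumes GCC or Clang for the inline asm syntax. As far as I know, smp_rmb()
and smp_wmb() on x86 both reduce to a plain compiler barrier, which is what
the first function models.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins, NOT the kernel's macros. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")
#define full_fence()       atomic_thread_fence(memory_order_seq_cst)

static int stat_value; /* stands in for the updated stat */
static int on_list;    /* stands in for the updated_next check */

/* Store, then load, separated only by a compiler barrier.
 * No instruction is emitted; the compiler just may not move or
 * cache memory accesses across this point. */
static int update_then_check_compiler_only(int v)
{
	stat_value = v;
	compiler_barrier();
	return on_list;
}

/* Same sequence with a full fence, which costs a real
 * serializing instruction on x86. */
static int update_then_check_full_fence(int v)
{
	stat_value = v;
	full_fence();
	return on_list;
}

/* Single-threaded sanity check: both variants agree here. */
static void self_check(void)
{
	assert(update_then_check_compiler_only(1) ==
	       update_then_check_full_fence(1));
}
```

Run on a single thread the two variants are indistinguishable. They differ
only in the generated code and in what another CPU running the mirror-image
sequence may observe, since x86 can let a later load pass an earlier store
when no full fence sits between them.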