Re: [PATCH 2/2] memcg: barriers to see memcgs as fully initialized

From: Michal Hocko
Date: Thu Feb 13 2014 - 09:53:24 EST


On Wed 12-02-14 17:29:09, Hugh Dickins wrote:
> Commit d8ad30559715 ("mm/memcg: iteration skip memcgs not yet fully
> initialized") is not bad, but Greg Thelen asks "Are barriers needed?"
>
> Yes, I'm afraid so: this makes it a little heavier than the original,
> but there's no point in guaranteeing that mem_cgroup_iter() returns only
> fully initialized memcgs, if we don't guarantee that the initialization
> is visible.
>
> If we move online_css()'s setting CSS_ONLINE after rcu_assign_pointer()
> (I don't see why not), we can reasonably rely on the smp_wmb() in that.
> But I can't find a pre-existing barrier at the mem_cgroup_iter() end,
> so add an smp_rmb() where __mem_cgroup_iter_next() returns non-NULL.
>
> Fixes: d8ad30559715 ("mm/memcg: iteration skip memcgs not yet fully initialized")
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # 3.12+
> ---
> I'd have been happier not to have to add this patch: maybe you can see
> a better placement, or a way we can avoid this altogether.

I don't know. I have thought about this again and I really do not see
why we have to provide such a guarantee, to be honest.

Such a half-initialized memcg wouldn't see its hierarchical parent
properly (including inherited attributes) and it wouldn't have kmem
fully initialized. But it also wouldn't have any tasks in it, IIRC, so it
shouldn't matter much.

So I really don't know whether this is worth all the trouble.
I am not saying your patch is wrong (although I am not sure whether
css->flags vs. subsystem css association ordering is relevant, and the
ae7f164a09408 changelog didn't help me much), and it made sense when
you proposed it back then, but the additional ordering requirements
complicate things.

I will keep thinking about that.
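
For reference, the pairing the patch relies on can be sketched in
userspace C11. This is only an illustration under stated assumptions:
struct demo_group, DEMO_ONLINE, demo_online() and demo_iter_check() are
hypothetical names, not kernel code, and the C11 release/acquire fences
are stand-ins that are somewhat stronger than the kernel's smp_wmb() and
smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg; not the kernel's struct mem_cgroup. */
struct demo_group {
	int inherited_limit;	/* filled in during "css_online"-style init */
	atomic_uint flags;	/* DEMO_ONLINE is set last */
};

#define DEMO_ONLINE 0x1u

/* Writer side: finish initialization, then publish the online flag. */
static void demo_online(struct demo_group *g, int parent_limit)
{
	assert(g != NULL);
	g->inherited_limit = parent_limit;
	/*
	 * Plays the role of the smp_wmb() in rcu_assign_pointer(): the
	 * initialization above is ordered before the flag update below.
	 */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_or_explicit(&g->flags, DEMO_ONLINE, memory_order_relaxed);
}

/* Reader side: return the group only if it is online, else NULL. */
static struct demo_group *demo_iter_check(struct demo_group *g)
{
	assert(g != NULL);
	if (!(atomic_load_explicit(&g->flags, memory_order_relaxed) & DEMO_ONLINE))
		return NULL;
	/*
	 * Plays the role of the proposed smp_rmb(): the caller's later
	 * loads from *g are ordered after the flag load above.
	 */
	atomic_thread_fence(memory_order_acquire);
	return g;
}
```

A reader that gets non-NULL back from demo_iter_check() may then read
inherited_limit and expect the value stored by demo_online(). Without the
reader-side fence that expectation would not be guaranteed on weakly
ordered CPUs, which is the extra ordering requirement in question here.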

> kernel/cgroup.c | 8 +++++++-
> mm/memcontrol.c | 11 +++++++++--
> 2 files changed, 16 insertions(+), 3 deletions(-)
>
> --- 3.14-rc2+/kernel/cgroup.c 2014-02-02 18:49:07.737302111 -0800
> +++ linux/kernel/cgroup.c 2014-02-12 11:59:52.804041895 -0800
> @@ -4063,9 +4063,15 @@ static int online_css(struct cgroup_subs
>  	if (ss->css_online)
>  		ret = ss->css_online(css);
>  	if (!ret) {
> -		css->flags |= CSS_ONLINE;
>  		css->cgroup->nr_css++;
>  		rcu_assign_pointer(css->cgroup->subsys[ss->subsys_id], css);
> +		/*
> +		 * Set CSS_ONLINE after rcu_assign_pointer(), so that its
> +		 * smp_wmb() will guarantee that those seeing CSS_ONLINE
> +		 * can see the initialization done in ss->css_online() - if
> +		 * they provide an smp_rmb(), as in __mem_cgroup_iter_next().
> +		 */
> +		css->flags |= CSS_ONLINE;
>  	}
>  	return ret;
>  }
> --- 3.14-rc2+/mm/memcontrol.c 2014-02-12 11:55:02.836035004 -0800
> +++ linux/mm/memcontrol.c 2014-02-12 11:59:52.804041895 -0800
> @@ -1128,9 +1128,16 @@ skip_node:
>  	 */
>  	if (next_css) {
>  		if ((next_css == &root->css) ||
> -		    ((next_css->flags & CSS_ONLINE) && css_tryget(next_css)))
> +		    ((next_css->flags & CSS_ONLINE) && css_tryget(next_css))) {
> +			/*
> +			 * Ensure that all memcg initialization, done before
> +			 * CSS_ONLINE was set, will be visible to our caller.
> +			 * This matches the smp_wmb() in online_css()'s
> +			 * rcu_assign_pointer(), before it set CSS_ONLINE.
> +			 */
> +			smp_rmb();
>  			return mem_cgroup_from_css(next_css);
> -
> +		}
>  		prev_css = next_css;
>  		goto skip_node;
>  	}
>

--
Michal Hocko
SUSE Labs