Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

From: Glauber Costa
Date: Wed Dec 04 2013 - 05:09:04 EST


>>
>> Could you do something clever with just one flag? Probably yes. But I
>> doubt it would
>> be that much cleaner, this is just the way that patching sites work.
>
> Thank you for spending your time to listen to me.
>
Don't worry! I thank you for carrying this forward.

> Let me try to explain what is bothering me.
>
> We have two state bits for each memcg, 'active' and 'activated'. There
> are two scenarios where the bits can be modified:
>
> 1) The kmem limit is set on a memcg for the first time -
> memcg_update_kmem_limit(). Here we call memcg_update_cache_sizes(),
> which sets the 'activated' bit on success, then update static branching,
> then set the 'active' bit. All three actions are done atomically in
> respect to other tasks setting the limit due to the set_limit_mutex.
> After both bits are set, they never get cleared for the memcg.
>
So far so good. But again, note how you yourself describe it:
the cations are done atomically *in respect to other tasks setting the limit*

But there are also tasks that are running its courses naturally and
just allocating
memory. For those, some call sites will be on, some will be off. We need to make
sure that *none* of them uses the patched site until *all* of them are patched.
This has nothing to do with updates, this is all about the readers.

> 2) When a subgroup of a kmem-active cgroup is created -
> memcg_propagate_kmem(). Here we copy kmem_account_flags from the parent,
> then increase static branching refcounter, then call
> memcg_update_cache_sizes() for the new memcg, which may clear the
> 'activated' bit on failure. After successful execution, the state bits
> never get cleared for the new memcg.
>
> In scenario 2 there is no need bothering about the flags setting order,
> because we don't have any tasks in the cgroup yet - the tasks can be
> moved in only after css_online finishes when we have both of the bits
> set and the static branching enabled. Actually, we already do not bother
> about it, because we have both bits set before the cgroup is fully
> initialized (memcg_update_cache_sizes() is called).
>
Yes, after the first cgroup is set none of that matters. But it is just easier
and less error prone just to follow the same path every time. As I have said,
if you can come up with a more clever way to deal with the problem above
that doesn't involve the double flag - and you can prove it works - I
am definitely
fine with it. But this is subtle code, and in the past - Michal can
attest this - we've
changed this being sure it would work just to see it explode in our faces.

So although I am willing to review every patch for correctness on that
front (I never
said I liked the 2-flags scheme...), unless you have a bug or real
problem on it,
I would advise against changing it if its just to make it more readable.

But again, don't take me too seriously on this. If you and Michal think you can
come up with something better, I'm all for it.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/