Re: [PATCH net] Revert "defer call to mem_cgroup_sk_alloc()"

From: Eric Dumazet
Date: Fri Feb 02 2018 - 12:59:58 EST


On Fri, 2018-02-02 at 16:57 +0000, Roman Gushchin wrote:
> This patch effectively reverts commit 9f1c2674b328 ("net: memcontrol:
> defer call to mem_cgroup_sk_alloc()").
>
> Moving mem_cgroup_sk_alloc() to the inet_csk_accept() completely breaks
> memcg socket memory accounting, as packets received before memcg
> pointer initialization are not accounted and are causing refcounting
> underflow on socket release.
>
> Actually the use-after-free problem was fixed by
> commit c0576e397508 ("net: call cgroup_sk_alloc() earlier in
> sk_clone_lock()") for the cgroup pointer.
>
> So, let's revert it and call mem_cgroup_sk_alloc() just before
> cgroup_sk_alloc(). This is safe, as we hold a reference to the socket
> we're cloning, and it holds a reference to the memcg.
>
> Signed-off-by: Roman Gushchin <guro@xxxxxx>
> Cc: Eric Dumazet <edumazet@xxxxxxxxxx>
> Cc: David S. Miller <davem@xxxxxxxxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Cc: Tejun Heo <tj@xxxxxxxxxx>
> ---
>  mm/memcontrol.c                 | 14 ++++++++++++++
>  net/core/sock.c                 |  5 +----
>  net/ipv4/inet_connection_sock.c |  1 -
>  3 files changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 0ae2dc3a1748..0937f2c52c7d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5747,6 +5747,20 @@ void mem_cgroup_sk_alloc(struct sock *sk)
>  	if (!mem_cgroup_sockets_enabled)
>  		return;
>
> +	/*
> +	 * Socket cloning can throw us here with sk_memcg already
> +	 * filled. It won't however, necessarily happen from
> +	 * process context. So the test for root memcg given
> +	 * the current task's memcg won't help us in this case.
> +	 *
> +	 * Respecting the original socket's memcg is a better
> +	 * decision in this case.
> +	 */
> +	if (sk->sk_memcg) {

The original commit had a BUG_ON(mem_cgroup_is_root(sk->sk_memcg)); here.

I presume it is no longer useful?

Thanks

> +		css_get(&sk->sk_memcg->css);
> +		return;
> +	}
> +
>  	rcu_read_lock();
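
For readers following the reference counting in the hunk above, here is a small userspace model of why the clone needs its own css_get() once sk_memcg has been copied from the parent. It is only a sketch: every toy_* name is hypothetical, the structs are stand-ins rather than the kernel definitions, and it models only the reference count on the memcg, not the packet accounting problem described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel objects; these are NOT the real definitions. */
struct toy_memcg {
	int refcnt;	/* models the css reference count */
};

struct toy_sock {
	struct toy_memcg *sk_memcg;
};

static void toy_css_get(struct toy_memcg *memcg)
{
	memcg->refcnt++;
}

static void toy_css_put(struct toy_memcg *memcg)
{
	assert(memcg->refcnt > 0);	/* a put with no reference left is a bug */
	memcg->refcnt--;
}

/*
 * Models the patched mem_cgroup_sk_alloc(): if the pointer is already
 * set (copied from the parent during cloning), take an extra reference
 * and keep it. Otherwise use the hypothetical memcg of the current task.
 */
static void toy_sk_alloc(struct toy_sock *sk, struct toy_memcg *current_memcg)
{
	if (sk->sk_memcg) {
		toy_css_get(sk->sk_memcg);
		return;
	}
	if (current_memcg) {
		toy_css_get(current_memcg);
		sk->sk_memcg = current_memcg;
	}
}

/* Models socket release: drop the reference this socket holds. */
static void toy_sk_free(struct toy_sock *sk)
{
	if (sk->sk_memcg) {
		toy_css_put(sk->sk_memcg);
		sk->sk_memcg = NULL;
	}
}

/* Clone a socket; take_ref selects whether the clone gets its own reference. */
static struct toy_sock toy_sk_clone(const struct toy_sock *parent, bool take_ref)
{
	struct toy_sock child = *parent;	/* plain struct copy, sk_memcg included */

	if (take_ref)
		toy_sk_alloc(&child, NULL);
	return child;
}

/* Returns the memcg refcount after the parent and one clone are released. */
static int toy_final_refcnt(bool take_ref)
{
	struct toy_memcg memcg = { .refcnt = 1 };	/* base reference owned by the cgroup */
	struct toy_sock parent = { .sk_memcg = NULL };
	struct toy_sock child;

	toy_sk_alloc(&parent, &memcg);
	child = toy_sk_clone(&parent, take_ref);
	toy_sk_free(&child);
	toy_sk_free(&parent);
	return memcg.refcnt;
}
```

In this model, toy_final_refcnt(true) leaves the base reference intact, while toy_final_refcnt(false) lets the clone's release consume a reference it never took, so the count ends one lower than it should.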