Re: [PATCH 3/8] net: consolidate memcg socket buffer tracking and accounting
From: Vladimir Davydov
Date: Fri Oct 23 2015 - 09:43:20 EST
On Thu, Oct 22, 2015 at 03:09:43PM -0400, Johannes Weiner wrote:
> On Thu, Oct 22, 2015 at 09:46:12PM +0300, Vladimir Davydov wrote:
> > On Thu, Oct 22, 2015 at 12:21:31AM -0400, Johannes Weiner wrote:
> > > The tcp memory controller has extensive provisions for future memory
> > > accounting interfaces that won't materialize after all. Cut the code
> > > base down to what's actually used, now and in the likely future.
> > >
> > > - There won't be any different protocol counters in the future, so a
> > > direct sock->sk_memcg linkage is enough. This eliminates a lot of
> > > callback maze and boilerplate code, and restores most of the socket
> > > allocation code to pre-tcp_memcontrol state.
> > >
> > > - There won't be a tcp control soft limit, so integrating the memcg
> >
> > In fact, the code is ready for the "soft" limit (I mean min, pressure,
> > max tuple), it just lacks a knob.
>
> Yeah, but that's not going to materialize if the entire interface for
> dedicated tcp throttling is considered obsolete.
Maybe it shouldn't be. My current understanding is that per-memcg tcp
window control is necessary, because:
- We need to be able to protect a containerized workload from its
  growing network buffers. Using vmpressure notifications for that does
  not look reassuring to me.
- We need a way to limit the network buffers of a particular container,
  otherwise it can fill the system-wide window, throttling other
  containers, which is unfair.
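To illustrate the second point, here is a toy userspace model, not kernel code. All names in it (buf_pool, buf_group, buf_charge, buf_fill) are made up for the example, and the numbers are arbitrary. It only shows that without a per-container cap a greedy container can consume the whole system-wide budget, assuming charges are checked against both limits:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: these types do not exist in the kernel. */
struct buf_pool {
	long usage;	/* pages charged system-wide */
	long limit;	/* system-wide cap */
};

struct buf_group {
	long usage;	/* pages charged to this container */
	long limit;	/* per-container cap; <= 0 means "no cap" */
};

/*
 * Charge nr pages to both the container and the global pool.
 * The charge is refused if either cap would be exceeded.
 */
static bool buf_charge(struct buf_pool *pool, struct buf_group *grp, long nr)
{
	assert(nr > 0);

	if (grp->limit > 0 && grp->usage + nr > grp->limit)
		return false;
	if (pool->usage + nr > pool->limit)
		return false;

	grp->usage += nr;
	pool->usage += nr;
	return true;
}

/* A greedy container: charge page by page until refused. */
static long buf_fill(struct buf_pool *pool, struct buf_group *grp)
{
	long got = 0;

	while (buf_charge(pool, grp, 1))
		got++;
	return got;
}
```

With a pool of 100 pages and no cap on the greedy container, buf_fill() takes all 100 pages and a second container cannot charge even one. With a cap of 60 on the greedy one, the second container still gets its page.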
>
> > > @@ -1136,9 +1090,6 @@ static inline bool sk_under_memory_pressure(const struct sock *sk)
> > > if (!sk->sk_prot->memory_pressure)
> > > return false;
> > >
> > > - if (mem_cgroup_sockets_enabled && sk->sk_cgrp)
> > > - return !!sk->sk_cgrp->memory_pressure;
> > > -
> >
> > AFAIU, now we won't shrink the window on hitting the limit, i.e. this
> > patch subtly changes the behavior of the existing knobs, potentially
> > breaking them.
>
> Hm, but there is no grace period in which something meaningful could
> happen with the window shrinking, is there? Any buffer allocation is
> still going to fail hard.
AFAIU, when we hit the limit, we not only throttle the allocating
socket, but also try to release space reserved by other sockets.
After your patch we won't. This looks unfair to me.
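A toy userspace model of what I mean, based on my reading of the hunk quoted above. It is not the kernel implementation: the struct and function names are hypothetical, the "reserved" field is a stand-in for a socket's spare forward allocation, and the reclaim step is simplified to a single function:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-cgroup protocol state. */
struct cg_proto_model {
	long usage;
	long limit;
	bool memory_pressure;	/* set when a charge hits the limit */
};

struct sock_model {
	struct cg_proto_model *cgrp;	/* NULL if not in a limited group */
	long reserved;			/* charged but unused pages */
};

static bool global_pressure;	/* models the protocol-wide flag */

/* Before the patch: a socket in a group reports the group's pressure. */
static bool under_pressure_old(const struct sock_model *sk)
{
	if (sk->cgrp)
		return sk->cgrp->memory_pressure;
	return global_pressure;
}

/* After the patch, as I read it: only the global flag is consulted. */
static bool under_pressure_new(const struct sock_model *sk)
{
	(void)sk;
	return global_pressure;
}

/* Charge nr pages; on hitting the group limit, flag pressure and fail. */
static bool charge(struct sock_model *sk, long nr)
{
	struct cg_proto_model *cg = sk->cgrp;

	if (cg == NULL)
		return true;
	if (cg->usage + nr > cg->limit) {
		cg->memory_pressure = true;
		return false;
	}
	cg->usage += nr;
	return true;
}

/* A socket that sees pressure gives back its spare reservation. */
static long reclaim_if_pressured(struct sock_model *sk,
				 bool (*under_pressure)(const struct sock_model *))
{
	long freed = 0;

	if (under_pressure(sk)) {
		freed = sk->reserved;
		sk->reserved = 0;
		if (sk->cgrp)
			sk->cgrp->usage -= freed;
	}
	return freed;
}
```

In this model, once socket A hits the group limit, socket B releases its reservation only under under_pressure_old(). Under under_pressure_new(), B keeps its pages and A keeps failing.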
Thanks,
Vladimir