Re: [PATCH v5] Soft limit rework

From: Johannes Weiner
Date: Tue Sep 17 2013 - 15:56:39 EST


On Mon, Sep 16, 2013 at 06:44:05PM +0200, Michal Hocko wrote:
> On Fri 13-09-13 12:17:09, Johannes Weiner wrote:
> > On Fri, Sep 13, 2013 at 04:49:53PM +0200, Michal Hocko wrote:
> > > On Fri 06-09-13 15:23:11, Johannes Weiner wrote:
> [...]
> > > > I would really like to deprecate soft limits and introduce something
> > > > new that has the proper semantics we want from the get-go. Its
> > > > implementation could very much look like your code, so we can easily
> > > > reuse that. But the interface and its semantics should come first.
> > >
> > > I am open to discussin such a change I just do not see any reason to
> > > have a crippled soft reclaim implementation for the mean time.
> > > Especially when it doesn't look like such a new interface is easy to
> > > agree on.
> >
> > We had a crippled soft limit implementation from the time it was
> > merged, it never worked better than now.
> >
> > You seem to think that this is an argument *for* finally fixing it. I
> > disagree. We should absolutely *avoid* steering people toward it now,
> > when the long term plan is already to get rid of it.
>
> It is not just about fixing it. It is also about getting rid of the
> bloat that the previous implementation depended on. Not only LoC but
> also the resulting binary:
>
> $ size mm/built-in.o
> base:
> text data bss dec hex filename
> 534283 233703 163456 931442 e3672 mm/built-in.o
> rework:
> text data bss dec hex filename
> 532866 229591 163456 925913 e20d9 mm/built-in.o
>
> I would like to get rid of as much of a special code as possible.
> Especially this one which I hate personally because it is a crude hack
> that shouldn't have existed.

The interface is, not the implementation.

> > There is a concensus that cgroups and the controllers were merged
> > before they were ready and we are now struggling heavily to iron out
> > the design mistakes with the minimum amount of disruption we can get
> > away with.
> >
> > We are also at this time coordinating with all the other controllers
> > and the cgroup core to do exactly that, where Tejun is providing us
> > with tools to revamp the problematic interfaces.
> >
> > And we agree that soft limits were such a design mistake that should
> > be ironed out.
> >
> > So for the love of everything we hold dear, why would you think that
> > NOW is a good time to fix the implemantion and get people to use it?
>
> There are users who already use this feature and it will take some (read
> a lot of) time to move them to something else. And that something else
> still doesn't exist and I suspect it will take some time to push it into
> a proper shape (and be sure we do not screw it this time).

Yet you could not name a single sensible use case. So no, we have no
known users, except maybe Google, who actually want guarantees.

> So while I agree that we need something more (semantically) reasonable
> there is no need to keep this crippled implementation around especially
> when it is non-trivial amount of code.

You traded a non-trivial amount of code for non-trivial code. And you
moved the cost from optional memcg code to essential memory management
code. All for a feature that needs to be redesigned and has a
questionable userbase.

> > > > > > You have not shown that prio-0 scans are a problem.
> > > > >
> > > > > OK, I thought this was self evident but let me be more specific.
> > > > >
> > > > > The scan the world is almost always a problem. We are no longer doing
> > > > > proportional anon/file reclaim (swappiness is ignored). This is wrong
> > > > > from at least two points of view. Firstly it makes the reclaim decisions
> > > > > different a lot for groups that are under the soft limit and those
> > > > > that are over. Secondly, and more importantly, this might lead to a
> > > > > pre-mature swapping, especially when there is a lot of IO going on.
> > > > >
> > > > > The global reclaim suffers from the very same problem and that is why
> > > > > we try to prevent from prio-0 reclaim as much as possible and use it
> > > > > only as a last resort.
> > > >
> > > > I know that and I can see that this should probably be fixed, but
> > > > there is no quantification for this. We have no per-memcg reclaim
> > > > statistics
> > >
> > > Not having statistic is a separate issue. It makes the situation worse
> > > but that is not a new thing. The old implementation is even worse
> > > because the soft reclaim activity is basically hidden from global
> > > reclaim counters. So a lot of pages might get scanned and we will have
> > > no way to find out. That part is inherently fixed by the series because
> > > of the integration.
> >
> > Because it's in the *global* reclaim counters? That's great but it
> > does not address the problem at all. This is about pressure balance
> > between groups and you don't have any numbers for that.
>
> yes, but the point was that if somebody uses soft reclaim currently you
> would miss a big part of reclaim activity because soft reclaim is not
> accounted even in the global counters. So you can see a long stall
> during direct reclaim while the counters look all good.

This is not going anywhere :(

> > This series is a grab bag of fixes that drag a lot of complexity from
> > memcg code into generic reclaim, to repair the age old implementation
> > of a user-visible interface that we already agree sucks balls and
> > should be deprecated. The fact that you did not even demonstrate that
> > the repair itself was successful is a secondary issue at this point,
> > but it certainly didn't help your case.
>
> I will follow up with the testing results later. I hope I manage to have
> them before I leave on vacation.
>
> The question, though, is whether even results supporting my claims about
> enhancements would make any difference. To be honest I thought that the
> integration would be non-controversial topic even without performance
> improvements which could have be expected due to removing prio-0 reclaim
> which is a pure evil. Also the natural fairness of the s.r. sounds like
> a good thing.
>
> Anyway. Your wording that nothing should be done about the soft reclaim
> seems to be quite clear though. If this position is really firm then go
> ahead and NACK the series _explicitly_ so that Andrew or you can send a
> revert request to Linus. I would really like to not waste a lot of time
> on testing right now when it wouldn't lead to anything.

Nacked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/