Re: [patch] mm: memcontrol: lockless page counters

From: Michal Hocko
Date: Tue Sep 23 2014 - 09:26:01 EST


On Mon 22-09-14 15:58:29, Johannes Weiner wrote:
> On Mon, Sep 22, 2014 at 07:28:00PM +0200, Michal Hocko wrote:
> > On Mon 22-09-14 11:50:49, Johannes Weiner wrote:
> > > On Mon, Sep 22, 2014 at 04:44:36PM +0200, Michal Hocko wrote:
> > > > On Fri 19-09-14 09:22:08, Johannes Weiner wrote:
> > [...]
> > > > Nevertheless I think that the counter should live outside of memcg (it
> > > > is ugly and bad in general to make HUGETLB controller depend on MEMCG
> > > > just to have a counter). If you made kernel/page_counter.c and led both
> > > > containers select CONFIG_PAGE_COUNTER then you do not need a dependency
> > > > on MEMCG and I would find it cleaner in general.
> > >
> > > The reason I did it this way is because the hugetlb controller simply
> > > accounts and limits a certain type of memory and in the future I would
> > > like to make it a memcg extension, just like kmem and swap.
> >
> > I am not sure this is the right way to go. Hugetlb has always been
> > "special" and I do not see any advantage to pull its specialness into
> > memcg proper.
> >
> > It would just make the code more complicated. I can also imagine
> > users who simply do not want to pay memcg overhead and use only
> > hugetlb controller.
>
> We already group user memory, kernel memory, and swap space together,
> what makes hugetlb-backed memory special?

There is only a little overlap between LRU backed and kmem accounted
memory with hugetlb which has always been standing aside from the rest
of the memory management code (THP being a successor which fits in much
better and which is already covered by memcg). It has basically its own
code path for every aspect of its object life cycle and internal data
structures which are in many ways not compatible with regular user or
kmem memory. Merging the controllers would require to merge hugetlb code
closer the MM code. Until then it just doesn't make sense to me.

> It's much easier to organize the code if all those closely related
> things are grouped together.

Could you be more specific please? How can pulling hugetlb details would
help other !hugetlb code paths?

> It's also better for the user interface to have a single memory
> controller.

I have seen so much confusion coming from hugetlb vs. THP that I think
the quite opposite is true. Besides that we would need a separate limit
for hugetlb accounted memory anyway so having a small and specialized
controller for specialized memory sounds like a proper way to go.

Finally, as mentioned in previous email, you might have users interested
only in hugetlb controller with memcg disabled.

> We're also close to the point where we don't differentiate between the
> root group and dedicated groups in terms of performance, Dave's tests
> fell apart at fairly high concurrency, and I'm already getting rid of
> the lock he saw contended.

Sure but this has nothing to do with it. Hugetlb can safely use the same
lockless counter as a replacement for res_counter and benefit from it
even though the contention hasn't been seen/reported yet.

> The downsides of fragmenting our configuration- and testspace, our
> user interface, and our code base by far outweigh the benefits of
> offering a dedicated hugetlb controller.

Could you be more specific please? Hugetlb has to be configured and
tested separately whether it would be in a separate controller or not.

Last but not least, even if this turns out to make some sense in
the future please do not mix those things together here. Your
res_counter -> page_counter transition makes a lot of sense for both
controllers. And it is a huge improvement. I do not see any reason
to pull a conceptually nontrivial merging/dependency of two separate
controllers into the picture. If you think it makes some sense then
bring that up later for a separate discussion.

Thanks!
--
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/