Re: [PATCH 0/5] memcg/kmem: switch to white list policy

From: Vladimir Davydov
Date: Mon Nov 09 2015 - 14:28:02 EST

On Mon, Nov 09, 2015 at 01:54:01PM -0500, Tejun Heo wrote:
> On Mon, Nov 09, 2015 at 09:28:40PM +0300, Vladimir Davydov wrote:
> > > I am _all_ for this semantic I am just not sure what to do with the
> > > legacy kmem controller. Can we change its semantic? If we cannot do that
> >
> > I think we can. If somebody reports a "bug" caused by this change, i.e.
> > basically notices that something that used to be accounted is not any
> > longer, it will be trivial to fix by adding __GFP_ACCOUNT where
> > appropriate. If it is not, e.g. if accounting of objects of a particular
> > type leads to intense false-sharing, we would end up disabling
> > accounting for it anyway.
>
> I agree too, if anything is meaningfully broken by the flip, it just
> indicates that the whitelist needs to be expanded; however, I wonder
> whether this would be done better at slab level rather than per
> allocation site.

I'd like to, but this is not as simple as it seems at first glance. The
problem is that slab caches of the same size are actively merged with
each other. If we just added a SLAB_ACCOUNT flag, which would be passed
to kmem_cache_create() to enable accounting, we'd divide all caches into
two groups that couldn't be merged with each other even if kmem
accounting was not used at all. This would be a show stopper.
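
To make the merging concern concrete, here is a minimal user-space
sketch, not kernel code. It assumes a toy merge rule (same object size
and same flags means the caches are shared); the real slab merge logic
checks more than that. All toy_* names and the TOY_SLAB_ACCOUNT value
are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a SLAB_ACCOUNT creation flag (toy model only). */
#define TOY_SLAB_ACCOUNT 0x1u
#define TOY_MAX_CACHES   8

struct toy_cache {
	size_t object_size;
	unsigned int flags;
	int refcount;		/* how many creators share this cache */
};

static struct toy_cache toy_caches[TOY_MAX_CACHES];
static int toy_nr_caches;

/*
 * Toy version of cache creation with merging: reuse an existing cache
 * when both the object size and the flags match, otherwise set up a
 * new one. Returns NULL when the fixed-size table is full.
 */
static struct toy_cache *toy_cache_create(size_t size, unsigned int flags)
{
	struct toy_cache *c;
	int i;

	for (i = 0; i < toy_nr_caches; i++) {
		c = &toy_caches[i];
		if (c->object_size == size && c->flags == flags) {
			c->refcount++;	/* merged with an existing cache */
			return c;
		}
	}

	if (toy_nr_caches == TOY_MAX_CACHES)
		return NULL;

	c = &toy_caches[toy_nr_caches++];
	c->object_size = size;
	c->flags = flags;
	c->refcount = 1;
	return c;
}

/* Number of distinct (unmerged) caches currently in the table. */
static int toy_cache_count(void)
{
	return toy_nr_caches;
}
```

In this model, two 64-byte caches created without the flag share one
entry, while a third 64-byte cache created with TOY_SLAB_ACCOUNT gets
its own entry, whether or not anything is ever charged to it.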

Of course, we could rework slab merging so that kmem_cache_create()
returned a new dummy cache even if it was actually merged. Such a cache
would point to the real cache, which would be used for allocations. This
wouldn't limit slab merging, but it would add one more dereference to
the alloc path, which is even worse.
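
The dummy-cache alternative can be sketched the same way. This is a
user-space model under assumed names (toy_real_cache, toy_handle and
toy_handle_alloc are hypothetical), with malloc() standing in for the
real allocator. It only shows where the extra pointer load would sit on
the allocation path.

```c
#include <stddef.h>
#include <stdlib.h>

/* The shared, merged cache that actually backs allocations. */
struct toy_real_cache {
	size_t object_size;
	unsigned long nr_allocs;
};

/*
 * Per-creator dummy cache: carries the creator's own settings and a
 * pointer to the merged cache it was folded into.
 */
struct toy_handle {
	struct toy_real_cache *real;
	int account;		/* would this creator's objects be charged? */
};

static void *toy_handle_alloc(struct toy_handle *h)
{
	/* This load is the extra dereference on every allocation. */
	struct toy_real_cache *rc = h->real;

	rc->nr_allocs++;
	return malloc(rc->object_size);
}
```

Two handles with different account settings can point at the same
toy_real_cache, so merging is preserved, but every allocation pays for
the h->real load first.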

That's why I decided to go with marking individual allocations.
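
As an illustration of per-allocation marking, the following user-space
sketch models an allocator that charges only those requests that carry
an accounting bit. TOY_GFP_KERNEL, TOY_GFP_ACCOUNT, toy_kmalloc and
toy_charged_bytes are hypothetical stand-ins; in the kernel the marker
discussed in this thread is __GFP_ACCOUNT, OR-ed into the gfp mask at
the call site.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy flag values; they do not match the kernel's gfp bits. */
#define TOY_GFP_KERNEL  0x1u
#define TOY_GFP_ACCOUNT 0x2u	/* stand-in for __GFP_ACCOUNT */

/* Bytes charged so far in this toy model. */
static size_t toy_charged_bytes;

/*
 * White list policy: nothing is charged unless the call site asked
 * for it by passing the accounting flag.
 */
static void *toy_kmalloc(size_t size, unsigned int gfp)
{
	void *p = malloc(size);

	if (p && (gfp & TOY_GFP_ACCOUNT))
		toy_charged_bytes += size;
	return p;
}
```

A call such as toy_kmalloc(256, TOY_GFP_KERNEL | TOY_GFP_ACCOUNT) is
charged, while toy_kmalloc(128, TOY_GFP_KERNEL) is not, and neither
call affects how the backing caches are merged.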
