Re: [PATCH 03/10] mm: Assign memcg-aware shrinkers bitmap to memcg

From: Vladimir Davydov
Date: Tue Mar 27 2018 - 06:00:59 EST


On Mon, Mar 26, 2018 at 06:29:05PM +0300, Kirill Tkhai wrote:
> >> @@ -182,6 +187,9 @@ struct mem_cgroup {
> >> unsigned long low;
> >> unsigned long high;
> >>
> >> + /* Bitmap of shrinker ids suitable to call for this memcg */
> >> + struct shrinkers_map __rcu *shrinkers_map;
> >> +
> >
> > We keep all per-node data in mem_cgroup_per_node struct. I think this
> > bitmap should be defined there as well.
>
> But them we'll have to have struct rcu_head for every node to free the map
> via rcu. This is the only reason I did that. But if you think it's not a problem,
> I'll agree with you.

I think it's OK. It'd be consistent with how list_lru handles
list_lru_memcg reallocations.

> >> @@ -4487,6 +4490,8 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
> >> struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> >> struct mem_cgroup_event *event, *tmp;
> >>
> >> + free_shrinker_maps(memcg);
> >> +
> >
> > AFAIU this can race with shrink_slab accessing the map, resulting in
> > use-after-free. IMO it would be safer to free the bitmap from css_free.
>
> But doesn't shrink_slab() iterate only online memcg?

Well, yes, shrink_slab() bails out if the memcg is offline, but I
suspect there might be a race condition between shrink_slab and
css_offline when shrink_slab calls shrinkers for an offline cgroup.

>
> >> /*
> >> * Unregister events and notify userspace.
> >> * Notify userspace about cgroup removing only after rmdir of cgroup
> >> diff --git a/mm/vmscan.c b/mm/vmscan.c
> >> index 97ce4f342fab..9d1df5d90eca 100644
> >> --- a/mm/vmscan.c
> >> +++ b/mm/vmscan.c
> >> @@ -165,6 +165,10 @@ static DECLARE_RWSEM(bitmap_rwsem);
> >> static int bitmap_id_start;
> >> static int bitmap_nr_ids;
> >> static struct shrinker **mcg_shrinkers;
> >> +struct shrinkers_map *__rcu root_shrinkers_map;
> >
> > Why do you need root_shrinkers_map? AFAIR the root memory cgroup doesn't
> > have kernel memory accounting enabled.
> But we can charge the corresponding lru and iterate it over global reclaim,
> don't we?

Yes, I guess you're right. But do we need to care about it? Would it be
OK if we iterated over all shrinkers for the root cgroup? Dunno...

Anyway, please try to handle the root cgroup consistently with other
cgroups. I mean, nothing like this root_shrinkers_map should exist.
It should be either a part of root_mem_cgroup or we should iterate over
all shrinkers for the root cgroup.

>
> struct list_lru_node {
> ...
> /* global list, used for the root cgroup in cgroup aware lrus */
> struct list_lru_one lru;
> ...
> };