Re: [PATCH 11/11] cgroup: use percpu refcnt for cgroup_subsys_states

From: Michal Hocko
Date: Fri Jun 14 2013 - 09:20:34 EST


On Wed 12-06-13 21:04:58, Tejun Heo wrote:
> A css (cgroup_subsys_state) is how each cgroup is represented to a
> controller. As such, it can be used in hot paths across the various
> subsystems different controllers are associated with.
>
> One of the common operations is reference counting, which up until now
> has been implemented using a global atomic counter and can have
> significant adverse impact on scalability. For example, css refcnt
> can be gotten and put multiple times by blkcg for each IO request.
> For highops configurations which try to do as much per-cpu as
> possible, the global frequent refcnting can be very expensive.
>
> In general, given the various and hugely diverse paths css's end up
> being used from, we need to make it cheap and highly scalable. In its
> usage, css refcnting isn't very different from module refcnting.
>
> This patch converts css refcnting to use the recently added
> percpu_ref.

I have no objection to changing the css reference counting scheme as
long as the guarantees we used to have still hold. I am just missing some
comparisons. Do you have any numbers that clearly show the benefits?

You mention that controllers which are strongly per-cpu oriented will
see the biggest improvements. What about the others? A single atomic_add
or atomic_dec_return is much lighter than the new ref counting. Could
those controllers regress, or would the differences be within the noise?
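
To make the comparison concrete, here is a minimal userspace sketch of
the two shapes being discussed. It is not the kernel code and is not
taken from the patch: shared_ref, sharded_ref, NR_SLOTS and
refcount_demo are hypothetical names, C11 atomics stand in for
atomic_add/atomic_dec_return, and a fixed array of slots stands in for
per-cpu counters. It also leaves out how the real percpu_ref detects
that the count has reached zero.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the number of CPUs */

/* Shared scheme: every get/put is an atomic RMW on the same cacheline. */
struct shared_ref {
	atomic_long count;
};

static void shared_ref_get(struct shared_ref *ref)
{
	atomic_fetch_add(&ref->count, 1);
}

/* Returns 1 if this put dropped the last reference. */
static int shared_ref_put(struct shared_ref *ref)
{
	return atomic_fetch_sub(&ref->count, 1) == 1;
}

/*
 * Sharded scheme: each slot only touches its own counter, so get/put
 * need no atomic operation on shared memory. A single slot may go
 * negative; only the sum over all slots is meaningful.
 * (A real implementation would keep the slots on separate cachelines.)
 */
struct sharded_ref {
	long count[NR_SLOTS];
};

static void sharded_ref_get(struct sharded_ref *ref, int slot)
{
	ref->count[slot]++;
}

static void sharded_ref_put(struct sharded_ref *ref, int slot)
{
	ref->count[slot]--;
}

/* Reading the total has to walk every slot, which is the slow side. */
static long sharded_ref_sum(const struct sharded_ref *ref)
{
	long sum = 0;

	for (int i = 0; i < NR_SLOTS; i++)
		sum += ref->count[i];
	return sum;
}

/* Single-threaded walk-through; returns the sharded references still held. */
static long refcount_demo(void)
{
	struct shared_ref a;
	struct sharded_ref b = { { 0 } };

	atomic_init(&a.count, 1);	/* base reference */
	shared_ref_get(&a);
	assert(!shared_ref_put(&a));	/* base reference is still held */

	sharded_ref_get(&b, 0);
	sharded_ref_get(&b, 1);
	sharded_ref_put(&b, 3);		/* a put may land on a different slot */
	return sharded_ref_sum(&b);
}
```

The sketch only shows where the cost moves: get/put become local
updates, while finding out whether the count is zero means summing
all slots. Whether that trade pays off for a given controller is
exactly what numbers would have to show.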

I do realize that it is not possible to test all controllers, but I
would be interested to see that at least those for which it really
matters get a nice boost. Memcg uses css with caution, although there are
places in really hot paths (e.g. charging), so any improvement would be
really welcome.

Sorry if this information has already been posted along with the series.
I was CCed only on this patch and haven't looked at the rest yet (apart
from "percpu: implement generic percpu refcounting" in your
review-css-percpu-ref branch).
[...]
--
Michal Hocko
SUSE Labs