Re: [PATCH 1/2] mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm()

From: Michal Hocko
Date: Mon Nov 18 2019 - 04:43:38 EST


On Fri 15-11-19 18:07:34, Roman Gushchin wrote:
> On Fri, Nov 15, 2019 at 06:47:21PM +0100, Michal Hocko wrote:
> > On Fri 15-11-19 18:40:31, Michal Hocko wrote:
> > > On Thu 14-11-19 11:37:36, Tejun Heo wrote:
> > > > Hello,
> > > >
> > > > On Thu, Nov 14, 2019 at 08:33:40PM +0100, Michal Hocko wrote:
> > > > > > It is useful for controlling admissions of new userspace visible uses
> > > > > > - e.g. a tracepoint shouldn't be allowed to be attached to a cgroup
> > > > > > which has already been deleted.
> > > > >
> > > > > I am not sure I understand. Roman says that the cgroup can get offline
> > > > > right after the function returns. How is "already deleted" different
> > > > > from "just deleted"? I thought that the state is preserved at least
> > > > > while the rcu lock is held but my memory is dim here.
> > > >
> > > > It's the same difference as between "opening a file and deleting it"
> > > > and "deleting a file and opening it".
> > >
> > > I am sorry but I do not follow. How can css_tryget_online provide the
> > > same semantic when the css can go offline right after the tryget call
> > > returns so it is effectively indistinguishable from the case when the
> > > css was already online before the call was made.
> >
> > s@online@offline@
> >
> > And reading after myself it turned out to sound differently than I
> > meant. What I wanted to say really is, what is the difference that
> > css_tryget_online really guarantees when the css might go offline right
> > after the call succeeds so more specifically what is the difference
> > between
> > 	if (css_tryget()) {
> > 		if (online)
> > 			DO_SOMETHING
> > 	}
> > and
> > 	if (css_tryget_online()) {
> > 		DO_SOMETHING
> > 	}
> >
> > both of them are racy and do not provide any guarantee wrt. online
> > state.
>
> Let me step back a little bit.
>
> I think we all agree that css_tryget_online() has weird semantics,
> in most cases is used only due to historical reasons and clearly asks
> for a cleanup. So I suggest we stop arguing about it and wait for the
> cleanup patchset. Then we can discuss each remaining use case in detail,
> if there are any.

Yes, I am all in favor of the cleanup patches as well as getting to
the bottom of the underlying issue (race). Unfortunately, Andrew has
already sent these two patches to Linus even though the changelog
is slightly misleading (btw 18fa84a2db0e has similar incorrect
reasoning).
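
For reference, the two patterns quoted above can be modelled in
userspace roughly as follows. This is only a sketch under stated
assumptions: struct fake_css and the fake_*() helpers are made-up
stand-ins, not the kernel's css implementation, and the concurrent
offlining is modelled as a plain sequential call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a css; not the kernel structure. */
struct fake_css {
	int refcnt;	/* 0 means the object has already been released */
	bool online;	/* cleared when the cgroup is removed */
};

/* Models css_tryget(): succeeds unless the object is already released. */
static bool fake_tryget(struct fake_css *css)
{
	if (css->refcnt == 0)
		return false;
	css->refcnt++;
	return true;
}

/* Models css_tryget_online(): also requires "online" at call time only. */
static bool fake_tryget_online(struct fake_css *css)
{
	if (!css->online)
		return false;
	return fake_tryget(css);
}

/* Models offlining; holding a reference does nothing to prevent it. */
static void fake_offline(struct fake_css *css)
{
	css->online = false;
}

/* Drops a reference taken by one of the tryget helpers. */
static void fake_put(struct fake_css *css)
{
	assert(css->refcnt > 0);
	css->refcnt--;
}
```

In this model a caller that got true from fake_tryget_online() can
still observe online == false as soon as fake_offline() runs, which
is the same window the open-coded "tryget, then check online" variant
has. The only thing either variant rules out is starting after the
css was already offline.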
--
Michal Hocko
SUSE Labs