Re: [PATCH v2 03/10] freezer: add new freezable helpers usingfreezer_do_not_count()

From: Tejun Heo
Date: Fri May 03 2013 - 00:09:44 EST


Hello,

On Thu, May 02, 2013 at 07:41:39PM -0700, Colin Cross wrote:
> On Thu, May 2, 2013 at 7:16 PM, Colin Cross <ccross@xxxxxxxxxxx> wrote:
> > This sounds the same as what ended up getting reverted in
> > https://lkml.org/lkml/2013/3/4/221
> > I can add the WARN_ON_ONCE to all my new calls, and leave them out of
> > existing calls, but that seems a little odd, and will be redundant if
> > the lockdep call in try_to_freeze goes back in in 3.11. Do you still
> > want it in the new apis?
...
> I could also put the lockdep check that was reveted back into
> try_to_freeze(), and add a freezable_schedule_unsafe() that skips it
> for use in the known-unsafe users in nfs, with a big comment not to
> add new users of it.

Oh yeah, that sounds like a good idea, and as for the non-ugly
variants, at least in the mid/long term, I think it'd be best to add
the lockdep annotation to try_to_freeze() with
__try_to_freeze_unsafe_youre_gonna_burn_in_hell_if_you_use_this() for
the existing users which should be gradually converted, but if that's
too burdensome, adding warnings to the wrappers should do for now, I
guess.

And I *hope* the lockdep annotation is stricter than what was added
before. I think it better be "no lock ever should be held at this
point" rather than "consider this a big lock". I'm not sure how much
consensus we have on this but I'd like to, in the longer term, merge
freezer into job control stop so that frozen tasks don't appear as
being in uninterruptible sleep. Currently, the frozen tasks are
essentially in UNINTERRUPTIBLE sleep which is okay for suspend but for
cgroup freezer, it sucks. You end up with tasks which you can't even
kill.

Combined with the locking problems, I was planning to update the
freezer such that the frozen state is implemented as a form of jobctl
stop, so that things like ptrace / kill -9 could work on them and we
also have the clear definition of the frozen state rather than the
current "it may get stuck somewhere in the kernel".

But that conflicts with what you're doing here which seems pretty
useful, so, to satisfy both goals, when somebody needs to put a
pseudo-frozen task into the actual frozen jobctl stop, those spots
which are currently using try_to_stop() would have to return an error,
most likely -EINTR with TIF_SIGPENDING set, and the control should
return towards userland so that signal handling path can be invoked.
ie. It should be possible to steer the tasks which are considered
frozen but not in the frozen jobctl stop into the jobctl stop without
any side effect. To do that, those spots basically have to be pretty
close to the userland boundary where it can easily leave the kernel
with -EINTR and AFAICS all the spots that you converted are like that
(which I think is natural). While not holding any locks doesn't
guarantee that, I think there'd be a fairly high correlation at least
and it'd be able to drive people towards finding out what's going on.

So, that's my agenda. Not sure how many actually agree with it.
Rafael, Oleg, Jeff?

Thanks.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/