Re: [Suspend-devel] [BUG] 3.7-rc regression bisected: s2disk fails to resume image: Processes could not be frozen, cannot continue resuming

From: Rafael J. Wysocki
Date: Thu Oct 17 2013 - 17:23:24 EST


Sorry for the huge delay.

On Tuesday, September 24, 2013 02:21:11 AM Pavel Machek wrote:
> Hi!
>
> > > And from suspend_ioctls.h:
> > > #define SNAPSHOT_IOC_MAGIC '3'
> > > #define SNAPSHOT_FREEZE _IO(SNAPSHOT_IOC_MAGIC, 1)
> > >
> > > My mistake, should be '3' instead of 3.
> >
> > OK... The thing to test, then, is what does __usermodehelper_disable()
> > return to freeze_processes(). If that's where this -EAGAIN comes from,
> > we at least have a plausible theory re what's going on.
> >
> > freeze_processes() uses __usermodehelper_disable() to stop any new userland
> > processes spawned by UMH (modprobe, etc.) and waits for ones it might be
> > waiting for to complete. Then it does try_to_freeze_tasks(), which
> > freezes remaining userland, carefully skipping the current thread.
> > However, it misses the possibility that current thread might have been
> > spawned by something that had been launched by UMH, with UMH waiting
> > for it. Which is the case of everything spawned by linuxrc.
> >
> > I'd try something like diff below, but I'm *NOT* familiar with swsusp at
> > all; it's not for mainline until ACKed by swsusp folks.
> >
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index fb32636..d968882 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -571,7 +571,8 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
> > DECLARE_COMPLETION_ONSTACK(done);
> > int retval = 0;
> >
> > - helper_lock();
> > + if (!(current->flags & PF_FREEZER_SKIP))
> > + helper_lock();
> > if (!khelper_wq || usermodehelper_disabled) {
> > retval = -EBUSY;
> > goto out;
> > @@ -611,7 +612,8 @@ wait_done:
> > out:
> > call_usermodehelper_freeinfo(sub_info);
> > unlock:
> > - helper_unlock();
> > + if (!(current->flags & PF_FREEZER_SKIP))
> > + helper_unlock();
> > return retval;
> > }
> > EXPORT_SYMBOL(call_usermodehelper_exec);
>
> PF_FREEZER_SKIP flag is manipulated at about 1000 places, so I'm not
> sure this will nest correctly.

This is not exactly correct unless 1000 is about 50. And none of them leads to
call_usermodehelper_exec() as far as I can say.

> They seem to be in form of
>
> |= FREEZER_SKIP
> schedule()
> &= ~FREEZER_SKIP
>
> so this should be safe, but...

I think the patch is correct, so

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>

Thanks!

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/