Re: [Suspend-devel] [BUG] 3.7-rc regression bisected: s2disk fails to resume image: Processes could not be frozen, cannot continue resuming

From: Andrew Savchenko
Date: Sun May 22 2016 - 04:47:52 EST


Hi,

On Thu, 17 Oct 2013 23:35:12 +0200 Rafael J. Wysocki wrote:
> Sorry for the huge delay.
>
> On Tuesday, September 24, 2013 02:21:11 AM Pavel Machek wrote:
> > Hi!
> >
> > > > And from suspend_ioctls.h:
> > > > #define SNAPSHOT_IOC_MAGIC '3'
> > > > #define SNAPSHOT_FREEZE _IO(SNAPSHOT_IOC_MAGIC, 1)
> > > >
> > > > My mistake, should be '3' instead of 3.
> > >
> > > OK... The thing to test, then, is what does __usermodehelper_disable()
> > > return to freeze_processes(). If that's where this -EAGAIN comes from,
> > > we at least have a plausible theory re what's going on.
> > >
> > > freeze_processes() uses __usermodehelper_disable() to stop any new userland
> > > processes spawned by UMH (modprobe, etc.) and waits for ones it might be
> > > waiting for to complete. Then it does try_to_freeze_tasks(), which
> > > freezes remaining userland, carefully skipping the current thread.
> > > However, it misses the possibility that current thread might have been
> > > spawned by something that had been launched by UMH, with UMH waiting
> > > for it. Which is the case of everything spawned by linuxrc.
> > >
> > > I'd try something like diff below, but I'm *NOT* familiar with swsusp at
> > > all; it's not for mainline until ACKed by swsusp folks.
> > >
> > > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > > index fb32636..d968882 100644
> > > --- a/kernel/kmod.c
> > > +++ b/kernel/kmod.c
> > > @@ -571,7 +571,8 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
> > >  	DECLARE_COMPLETION_ONSTACK(done);
> > >  	int retval = 0;
> > >
> > > -	helper_lock();
> > > +	if (!(current->flags & PF_FREEZER_SKIP))
> > > +		helper_lock();
> > >  	if (!khelper_wq || usermodehelper_disabled) {
> > >  		retval = -EBUSY;
> > >  		goto out;
> > > @@ -611,7 +612,8 @@ wait_done:
> > >  out:
> > >  	call_usermodehelper_freeinfo(sub_info);
> > >  unlock:
> > > -	helper_unlock();
> > > +	if (!(current->flags & PF_FREEZER_SKIP))
> > > +		helper_unlock();
> > >  	return retval;
> > >  }
> > >  EXPORT_SYMBOL(call_usermodehelper_exec);
> >
> > PF_FREEZER_SKIP flag is manipulated at about 1000 places, so I'm not
> > sure this will nest correctly.
>
> This is not exactly correct unless 1000 is about 50. And none of them leads to
> call_usermodehelper_exec() as far as I can say.
>
> > They seem to be in form of
> >
> > |= FREEZER_SKIP
> > schedule()
> > &= ~FREEZER_SKIP
> >
> > so this should be safe, but...
>
> I think the patch is correct, so
>
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> Thanks!

The problem is still present in the 4.6.0 kernel. I have updated the
patch to match the current code base while keeping the original logic.
It works fine for me here (4.6.0, x86, Atom N270, EeePC 1000H).
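
For anyone skimming the thread, here is a rough userspace model of the
idea as I understand it. It is only a sketch, not kernel code: every
model_* name and the MODEL_PF_FREEZER_SKIP value are made up for
illustration, the wait with timeout in __usermodehelper_disable() is
reduced to a single counter check, and it assumes the thread that
launched linuxrc runs with PF_FREEZER_SKIP set.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in flag bit; NOT the kernel's real PF_FREEZER_SKIP value. */
#define MODEL_PF_FREEZER_SKIP 0x1u

/* Hypothetical stand-in for the calling task ("current"). */
struct model_task {
	unsigned int flags;
};

/* Models the count of in-flight helpers kept in kernel/kmod.c. */
static int running_helpers;

static void model_helper_lock(void)
{
	running_helpers++;
}

static void model_helper_unlock(void)
{
	running_helpers--;
}

/* Models __usermodehelper_disable() without the timeout:
 * it only succeeds when no helper is counted as running. */
static int model_usermodehelper_disable(void)
{
	return running_helpers == 0 ? 0 : -EAGAIN;
}

/* One resume attempt made while the helper that started linuxrc
 * is still being waited for. patched != 0 applies the new check. */
int model_resume_attempt(int patched)
{
	struct model_task caller = { .flags = MODEL_PF_FREEZER_SKIP };
	int counted = !patched || !(caller.flags & MODEL_PF_FREEZER_SKIP);
	int ret;

	running_helpers = 0;
	if (counted)
		model_helper_lock();

	/* linuxrc (or its child) now asks for processes to be frozen. */
	ret = model_usermodehelper_disable();

	if (counted)
		model_helper_unlock();
	assert(running_helpers == 0);	/* lock and unlock stay balanced */
	return ret;
}
```

In this model, model_resume_attempt(0) returns -EAGAIN and
model_resume_attempt(1) returns 0. The same flag test guards both the
lock and the unlock, so the counter stays balanced either way.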

Best regards,
Andrew Savchenko
--- linux-4.6.0/kernel/kmod.c.orig 2016-05-16 01:43:13.000000000 +0300
+++ linux-4.6.0/kernel/kmod.c 2016-05-22 09:49:12.654159691 +0300
@@ -561,7 +561,8 @@
 		call_usermodehelper_freeinfo(sub_info);
 		return -EINVAL;
 	}
-	helper_lock();
+	if (!(current->flags & PF_FREEZER_SKIP))
+		helper_lock();
 	if (usermodehelper_disabled) {
 		retval = -EBUSY;
 		goto out;
@@ -595,7 +596,8 @@
 out:
 	call_usermodehelper_freeinfo(sub_info);
 unlock:
-	helper_unlock();
+	if (!(current->flags & PF_FREEZER_SKIP))
+		helper_unlock();
 	return retval;
 }
 EXPORT_SYMBOL(call_usermodehelper_exec);
