Re: [RFC] Fix SMP brokenness for PF_FREEZE and make freezing usablefor other purposes

From: Christoph Lameter
Date: Tue Jun 28 2005 - 02:10:21 EST


On Tue, 28 Jun 2005, Kirill Korotaev wrote:

> > -static inline int freezing(struct task_struct *p)
> > -{
> > - return p->flags & PF_FREEZE;
> > -}
> > +#if defined(CONFIG_PM) || defined(CONFIG_MIGRATE)
> <<<< why not to make a single option CONFIG_REFRIGERATOR? It looks to be a
> more robust way, since there are multiple users of it.

Yes. That may be better. We can do that once the migration code is
finished and when we know what kind of CONFIG_XXX the migration code
really needs.

> > +#ifdef CONFIG_PM
> <<<< is it intentionaly? or you just lost CONFIG_MIGRATE?

It is intentional. freeze_processes and thaw_processes are only needed
for suspend. One only needs to freeze a couple of processes for process
migration.

> <<<< I still think this refrigerator is racy with freeze_processes():
> <<<< scenarios:
> <<<< scenario 1
> <<<<
> task1 -> freeze_processes(): taskXXX ->refrigerator()
> checked (task->flags & PF_FROZEN) == 0 cur->flags |= PF_FROZEN
> clear TIF_FREEZE
> <sleep on thaw>
> set TIF_FREEZING
> clear PF_FROZEN
>
> <<<< so the task awakes with TIF_FREEZE flag set!!!

Hmm... If we wait to clear both flags until after the completion
notification then we do not have the race right? But then we need to move
the signal recalc since it tests for TIF_FREEZE too.

> <<<< scenario 2
> <<<< look at error path in freeze_processes (on timeout), it is broken as
> well. You need to wakeup tasks there...

Ok. How about this additional patch? This still requires that process
freezing does not immediately occurr again after the completion
handler. All of this is iffy due to not having a real lock protecting all
these values and we may still need to add some barriers.

Index: linux-2.6.12/kernel/power/process.c
===================================================================
--- linux-2.6.12.orig/kernel/power/process.c 2005-06-28 06:34:52.000000000 +0000
+++ linux-2.6.12/kernel/power/process.c 2005-06-28 06:40:28.000000000 +0000
@@ -47,12 +47,13 @@ int freeze_processes(void)
unsigned long flags;
if (!freezeable(p))
continue;
- if ((p->flags & PF_FROZEN) ||
- (p->state == TASK_TRACED) ||
+ if ((p->state == TASK_TRACED) ||
(p->state == TASK_STOPPED))
continue;

set_thread_flag(TIF_FREEZE);
+ if (p->flags & PF_FROZEN)
+ continue;
spin_lock_irqsave(&p->sighand->siglock, flags);
signal_wake_up(p, 0);
spin_unlock_irqrestore(&p->sighand->siglock, flags);
@@ -63,6 +64,8 @@ int freeze_processes(void)
if (time_after(jiffies, start_time + TIMEOUT)) {
printk( "\n" );
printk(KERN_ERR " stopping tasks failed (%d tasks remaining)\n", todo );
+ complete_all(&thaw);
+ up(&freezer_sem);
return todo;
}
} while(todo);
Index: linux-2.6.12/kernel/sched.c
===================================================================
--- linux-2.6.12.orig/kernel/sched.c 2005-06-28 06:34:52.000000000 +0000
+++ linux-2.6.12/kernel/sched.c 2005-06-28 06:37:36.000000000 +0000
@@ -5210,13 +5210,13 @@ DECLARE_COMPLETION(thaw);
void refrigerator(void)
{
current->flags |= PF_FROZEN;
+ wait_for_completion(&thaw);
clear_thread_flag(TIF_FREEZE);
+ current->flags &= ~PF_FROZEN;
/* A fake signal 0 may have been sent. Recalculate sigpending */
spin_lock_irq(&current->sighand->siglock);
recalc_sigpending();
spin_unlock_irq(&current->sighand->siglock);
- wait_for_completion(&thaw);
- current->flags &= ~PF_FROZEN;
}
EXPORT_SYMBOL(refrigerator);
#endif
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/