Re: [PATCH v2] freezer,sched: Use saved_state to reduce some spurious wakeups

From: Peter Zijlstra
Date: Thu Sep 07 2023 - 11:31:45 EST


On Mon, Sep 04, 2023 at 08:59:03PM -0700, Elliot Berman wrote:
>
>
> On 9/4/2023 2:23 PM, Peter Zijlstra wrote:
> > On Wed, Aug 30, 2023 at 10:42:39AM -0700, Elliot Berman wrote:
> >
> > > Avoid the spurious wakeups by saving the state of TASK_FREEZABLE tasks.
> > > If the task was running before entering TASK_FROZEN state
> > > (__refrigerator()) or if the task received a wake up for the saved
> > > state, then the task is woken on thaw. saved_state from PREEMPT_RT locks
> > > can be re-used because freezer would not stomp on the rtlock wait flow:
> > > TASK_RTLOCK_WAIT isn't considered freezable.
> >
> > You don't actually assert that anywhere I think, so the moment someone
> > makes that happen you crash and burn.
> >
>
> I can certainly add an assertion on the freezer side.

I think the assertion we have in ttwu_state_match() might be sufficient.
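For reference, that is (roughly, from the current tree) the
CONFIG_DEBUG_PREEMPT check at the top of that function, which already
complains if a waker passes a state mask that mixes TASK_RTLOCK_WAIT
with anything else:

	if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)) {
		WARN_ON_ONCE((state & TASK_RTLOCK_WAIT) &&
			     state != TASK_RTLOCK_WAIT);
	}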

> > Also:
> >
> > > -#ifdef CONFIG_PREEMPT_RT
> > > +#if IS_ENABLED(CONFIG_PREEMPT_RT) || IS_ENABLED(CONFIG_FREEZER)
> >
> > That makes wakeup more horrible for everyone :/
>
> I don't think the hot wakeup path is significantly impacted because the
> added checks only run after the hot path has already not been taken.

Perhaps we should start off by doing the below, instead of making it
more complicated. I suppose you're right about the overhead, but run a
hackbench just to make sure or something.
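(For example, something like "perf bench sched messaging -g 20 -l 2000",
or a standalone hackbench run with similar group/loop counts, would be a
reasonable smoke test; the exact flags are only a suggestion.)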


diff --git a/include/linux/sched.h b/include/linux/sched.h
index 77f01ac385f7..649ddb9adf0d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -749,11 +749,7 @@ struct task_struct {
 	struct thread_info		thread_info;
 #endif
 	unsigned int			__state;
-
-#ifdef CONFIG_PREEMPT_RT
-	/* saved state for "spinlock sleepers" */
 	unsigned int			saved_state;
-#endif
 
 	/*
 	 * This begins the randomizable portion of task_struct. Only
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2299a5cfbfb9..b566821614e1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2239,31 +2239,21 @@ int __task_state_match(struct task_struct *p, unsigned int state)
 	if (READ_ONCE(p->__state) & state)
 		return 1;
 
-#ifdef CONFIG_PREEMPT_RT
 	if (READ_ONCE(p->saved_state) & state)
 		return -1;
-#endif
+
 	return 0;
 }
 
 static __always_inline
 int task_state_match(struct task_struct *p, unsigned int state)
 {
-#ifdef CONFIG_PREEMPT_RT
-	int match;
-
 	/*
 	 * Serialize against current_save_and_set_rtlock_wait_state() and
 	 * current_restore_rtlock_saved_state().
 	 */
-	raw_spin_lock_irq(&p->pi_lock);
-	match = __task_state_match(p, state);
-	raw_spin_unlock_irq(&p->pi_lock);
-
-	return match;
-#else
+	guard(raw_spinlock_irq)(&p->pi_lock);
 	return __task_state_match(p, state);
-#endif
 }
 
 /*
@@ -4056,7 +4046,6 @@ bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
 
 	*success = !!(match = __task_state_match(p, state));
 
-#ifdef CONFIG_PREEMPT_RT
 	/*
 	 * Saved state preserves the task state across blocking on
 	 * an RT lock. If the state matches, set p::saved_state to
@@ -4072,7 +4061,7 @@ bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
 	 */
 	if (match < 0)
 		p->saved_state = TASK_RUNNING;
-#endif
+
 	return match > 0;
 }
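
For anyone not used to the guard() helpers yet:
guard(raw_spinlock_irq)(&p->pi_lock) is the scope-based lock guard from
<linux/cleanup.h>/<linux/spinlock.h>; it takes the lock where it is
declared and drops it automatically when the scope is left, which is why
the explicit unlock and the local 'match' variable can go away above. A
rough user-space analogue of the pattern, sketched only for illustration
(assumes GCC/Clang __attribute__((cleanup)), not kernel code):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int the_state;

/* cleanup handler: runs when the guard variable goes out of scope */
static void unlock_cleanup(pthread_mutex_t **l)
{
	pthread_mutex_unlock(*l);
}

/* poor man's guard(): lock now, auto-unlock at end of scope */
#define guard_mutex(l) \
	pthread_mutex_t *__guard __attribute__((cleanup(unlock_cleanup))) = \
		({ pthread_mutex_lock(l); (l); })

static int read_state(void)
{
	guard_mutex(&lock);
	return the_state;	/* mutex is released on this return */
}

int main(void)
{
	the_state = 42;
	printf("%d\n", read_state());
	return 0;
}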