Re: [PATCH v2] freezer,sched: Use saved_state to reduce some spurious wakeups

From: Elliot Berman
Date: Fri Sep 08 2023 - 16:08:29 EST




On 9/7/2023 2:46 AM, Peter Zijlstra wrote:
> On Mon, Sep 04, 2023 at 08:59:03PM -0700, Elliot Berman wrote:
>>
>>
>> On 9/4/2023 2:23 PM, Peter Zijlstra wrote:
>>> On Wed, Aug 30, 2023 at 10:42:39AM -0700, Elliot Berman wrote:
>>>
>>>> Avoid the spurious wakeups by saving the state of TASK_FREEZABLE tasks.
>>>> If the task was running before entering TASK_FROZEN state
>>>> (__refrigerator()) or if the task received a wake up for the saved
>>>> state, then the task is woken on thaw. saved_state from PREEMPT_RT locks
>>>> can be re-used because freezer would not stomp on the rtlock wait flow:
>>>> TASK_RTLOCK_WAIT isn't considered freezable.
>>>
>>> You don't actually assert that anywhere I think, so the moment someone
>>> makes that happen you crash and burn.
>>>
>>
>> I can certainly add an assertion on the freezer side.
>
> I think the assertion we have in ttwu_state_match() might be sufficient.
>

That assertion only checks that a wakeup is attempted with
TASK_RTLOCK_WAIT alone and no other bits. I think it would also be good
to have assertions that TASK_RTLOCK_WAIT and TASK_FROZEN are mutually
exclusive bits. I can add these assertions (in a separate patch?), but
I think the extra tests would impact the hot path.
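
For illustration only, an exclusivity check of that kind could look roughly
like the userspace sketch below. The bit values are placeholders rather than
the definitions from include/linux/sched.h, and the helper name is
hypothetical; an in-kernel version would presumably wrap the negated test in
WARN_ON_ONCE() at whichever site is chosen.

```c
#include <stdbool.h>

/*
 * Placeholder bit values for illustration only; the real definitions
 * live in include/linux/sched.h and may differ.
 */
#define SKETCH_TASK_RTLOCK_WAIT	0x1000u
#define SKETCH_TASK_FROZEN	0x8000u

/*
 * Hypothetical helper: returns true unless both TASK_RTLOCK_WAIT and
 * TASK_FROZEN are set in the same state word.
 */
static bool sketch_state_exclusive(unsigned int state)
{
	const unsigned int both = SKETCH_TASK_RTLOCK_WAIT | SKETCH_TASK_FROZEN;

	return (state & both) != both;
}
```

The test is a single mask-and-compare, so the cost is small, but it is still
an extra branch wherever it is placed.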

>>> Also:
>>>
>>>> -#ifdef CONFIG_PREEMPT_RT
>>>> +#if IS_ENABLED(CONFIG_PREEMPT_RT) || IS_ENABLED(CONFIG_FREEZER)
>>>
>>> That makes wakeup more horrible for everyone :/
>>
>> I don't think the hot wakeup path is significantly impacted because the
>> added checks come after the hot path is already not taken.
>
> Perhaps we should start off by doing the below, instead of making it
> more complicated instead. I suppose you're right about the overhead, but
> run a hackbench just to make sure or something.
>

I ran perf bench sched message -g 40 -l 40 with the v3 patch [1]. After 60
iterations each, I don't see a significant difference on my arm64 platform:
both samples are approximately normal with approximately equal variance,
and a t-test gives a p-value of 0.79.

We also ran typical high-level benchmarks for our SoCs (antutu,
geekbench, et al.) and didn't see any regressions there.

[1]: https://lore.kernel.org/all/20230908-avoid-spurious-freezer-wakeups-v3-1-d49821fda04d@xxxxxxxxxxx/

Thanks,
Elliot