Re: [RFC][PATCH 2/2] PM / Sleep: Introduce cooperative suspend/hibernate mode

From: Rafael J. Wysocki
Date: Mon Oct 17 2011 - 17:11:21 EST


On Monday, October 17, 2011, John Stultz wrote:
> On Sat, 2011-10-15 at 23:29 +0200, Rafael J. Wysocki wrote:
> > So I think (please correct me if I'm wrong) that you're worried about the
> > following situation:
> >
> > - The process opens /dev/sleepctl and sets the timeout
> > - It sets up a wake alarm to trigger at time T.
> > - It goes to sleep and sets its wakeup time to time T too, e.g. using select()
> > with a timeout.
> > - The system doesn't go to sleep in the meantime.
> > - The wake alarm triggers a bit earlier than the process is woken up and
> > system suspend is started between the two events.
> >
> > This particular race is avoidable if the process sets its wakeup time
> > to T - \Delta T, where \Delta T is enough for the process to be scheduled
> > and run ioctl(sleepfd, SLEEPCTL_STAY_AWAKE). So the complete sequence may
> > look like this:
> >
> > - The process opens /dev/sleepctl as sleepfd1 and sets the timeout to 0.
> > - The process opens /dev/sleepctl as sleepfd2 and sets the timeout to T_2.
> > T_2 should be sufficient for the process to be able to call
> > ioctl(sleepfd1, SLEEPCTL_STAY_AWAKE) when woken up.
> > - It sets up a wake alarm to trigger at time T.
> > - It goes to sleep and sets its wakeup time to time T - \Delta T, such that
> > \Delta T is sufficient for the process to call
> > ioctl(sleepfd2, SLEEPCTL_STAY_AWAKE).
> >
> > Then, if system suspend happens before T - \Delta T, the process will be
> > woken up along with the wakealarm event at time T and it will be able to call
> > ioctl(sleepfd1, SLEEPCTL_STAY_AWAKE) before T_2 expires. If system suspend
> > doesn't happen in that time frame, the process will wake up at T - \Delta T
> > and it will be able to call ioctl(sleepfd1, SLEEPCTL_STAY_AWAKE) (even if
> > system suspend triggers after the process has been woken up and before it's
> > able to run the ioctl, it doesn't matter, because the wakealarm wakeup will
> > trigger the sleepfd2's STAY_AWAKE anyway).
>
> So, the alarmtimer code is a bit simpler than what you describe
> above (alarmtimers are just like regular POSIX timers, except that they
> enable an RTC wakeup for the soonest event when the system goes into suspend).
>
> However, such a dual-timer style behavior seems like it could work for
> timer-driven wakeups (and has been suggested to me by others as well).
> Just to reiterate my understanding so that we're sure we're on the same
> wavelength:
>
> For any timer-style wakeup event, you set another non-wakeup timer for
> some small period of time before the wakeup timer. Then when the
> non-wakeup timer fires, the application inhibits suspend and waits for
> the wakeup timer.
>
> Thus if the system is suspended, the system will stay asleep until the
> wakeup event, where we'll hold off suspend for a timeout length so the
> task can run. If the system is not suspended, the early timer inhibits
> suspend to block the possible race.
>
> So yes, while not a very elegant solution in my mind (as it's still racy
> like any timeout-based solution), it would seem to be workable in
> practice, assuming wide error margins are used as the kernel does not
> guarantee that timers will fire at a specific time (only after the
> requested time).
>
> And this again assumes we'll see no timing issues as a result of system
> load or realtime task processing.
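
The arithmetic behind the dual-timer scheme described above can be sketched
roughly as follows. This is only an illustration under assumptions: the helper
name early_timer_expiry() and the idea of passing the margin as a timespec are
hypothetical and come from neither the patch nor the alarmtimer code. The
result could then be used as an absolute expiry for the non-wakeup timer, for
example via timer_settime() with TIMER_ABSTIME.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper: compute the expiry of the early, non-wakeup timer
 * as wake_at - margin. Both inputs must be normalised
 * (0 <= tv_nsec < NSEC_PER_SEC); the result is normalised as well.
 */
static struct timespec early_timer_expiry(struct timespec wake_at,
                                          struct timespec margin)
{
    struct timespec early;

    assert(wake_at.tv_nsec >= 0 && wake_at.tv_nsec < NSEC_PER_SEC);
    assert(margin.tv_nsec >= 0 && margin.tv_nsec < NSEC_PER_SEC);

    early.tv_sec = wake_at.tv_sec - margin.tv_sec;
    early.tv_nsec = wake_at.tv_nsec - margin.tv_nsec;
    if (early.tv_nsec < 0) {
        /* Borrow one second so tv_nsec stays in range. */
        early.tv_sec -= 1;
        early.tv_nsec += NSEC_PER_SEC;
    }
    return early;
}
```

Since timers are only guaranteed to fire at or after the requested time, the
margin would need to be generous for this to hold up under load.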
>
>
> > Still, there appear to be similar races that aren't avoidable (for example,
> > if the time the wake alarm will trigger is not known to the process in
> > advance), so I have an idea how to address them. Namely, suppose we have
> > one more ioctl, SLEEPCTL_WAIT_EVENT, that's equivalent to a combination
> > of _RELAX, wait and _STAY_AWAKE such that the process will be sent a signal
> > (say SIGPWR) on the first wakeup event and its _STAY_AWAKE will trigger
> > automatically.
>
> So actually the first sentence above is key, so let me talk about that
> before I get into your new solution: As long as we know the timer is
> going to fire, we can set the pre-timer to inhibit suspend. But most
> wakeup events (network packets, keyboard presses, other buttons) are not
> timer based, and we don't know when they would arrive. Thus the same
> race could trigger between a wakeup-button press and a suspend call.
>
> 1) wakeup key press
> 2) suspend call
> 3) key-press task scheduled
>
> That's why I suggested adding the timeout on any wake event, instead of
> resume. This would block the suspend call in between the wake event and
> the application processing it.
>
> Really, the interaction is between the wakeup event and it being
> processed in userland. Resume, if it occurs, should really be
> transparent to that interaction. So that's why I think the
> resume-specific behavior in your original proposal doesn't make sense.
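
The effect of a hold-off that starts at the wake event, rather than at resume,
can be modelled with a toy check like the one below. This is a sketch under
assumptions and not kernel code: struct wake_state, suspend_allowed() and the
millisecond timestamps are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: times are plain millisecond counters. */
struct wake_state {
    long last_wake_event_ms; /* time of the most recent wakeup event */
    long hold_off_ms;        /* how long suspend is refused after it */
};

/* Returns true if a suspend request made at now_ms would be allowed. */
static bool suspend_allowed(const struct wake_state *ws, long now_ms)
{
    assert(now_ms >= ws->last_wake_event_ms);
    return now_ms - ws->last_wake_event_ms >= ws->hold_off_ms;
}
```

In the three-step sequence above, a suspend call arriving shortly after the key
press is refused as long as the hold-off has not elapsed, which gives the
key-press task a window in which to be scheduled. With a hold-off of 0 the
suspend call goes through and the race remains.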
>
>
> > So in the scenario above:
> >
> > - The process opens /dev/sleepctl, sets the timeout to 0 and calls
> > ioctl(sleepfd, SLEEPCTL_STAY_AWAKE).
> > - It sets up a wake alarm to trigger at time T.
> > - It runs ioctl(sleepfd, SLEEPCTL_WAIT_EVENT) which "relaxes" its sleepfd
> > and makes it go to sleep until the first wakeup event happens.
> > - The process' signal handler checks if the current time is >= T and makes
> > the process go to the previous step if not.
>
>
> So I'm not sure if I'm understanding your suggestion totally. Is it that
> when you call SLEEPCTL_WAIT_EVENT, the ioctl sets SLEEPCTL_RELAX, and
> then the ioctl call blocks?
>
> Then when the signal handler triggers, where exactly does the
> SLEEPCTL_STAY_AWAKE call get made? Is it in the signal handler (after
> the task has been scheduled)? Or is it done by the kernel on task
> wakeup?
>
> If it's the former, I don't see how it blocks the race.
>
> If it's the latter, then it seems this proposal starts to somewhat
> approximate my proposal (ie: kernel allows suspend on blocking on a
> specific device, then disables it on task wakeup).

It's the latter, but I think I have a better idea.

Please see my recent reply to Alan in this thread for details.

Thanks,
Rafael