Re: S3 resume regression [1cf4f629d9d2 ("cpu/hotplug: Move online calls to hotplugged cpu")]

From: Feng Tang
Date: Tue Nov 08 2016 - 22:47:24 EST


On Wed, Nov 02, 2016 at 04:47:37AM +0800, Ville Syrjälä wrote:
> On Fri, Oct 28, 2016 at 08:58:41PM +0200, Thomas Gleixner wrote:
> > On Fri, 28 Oct 2016, Ville Syrjälä wrote:
> > > On Thu, Oct 27, 2016 at 10:41:18PM +0200, Thomas Gleixner wrote:
> > > > On Thu, 27 Oct 2016, Ville Syrjälä wrote:
> > > > > On Thu, Oct 27, 2016 at 09:25:05PM +0200, Thomas Gleixner wrote:
> > > > > > So it would be interesting whether that hunk in resume_broadcast() is
> > > > > > sufficient.
> > > > >
> > > > > So far it looks like the answer is yes.
> > > > >
> > > > > Looks to be about 5 seconds slower than acpi-idle in resuming, but
> > > > > I suppose that's not all that surprising ;)
> > > >
> > > > Well, set it to 1msec then. If that works reliably then we really can do
> > > > that unconditionally. There is no harm in firing a useless timer during
> > > > resume once.
> > >
> > > I narrowed down the required timeout, and looks like 25ms is the
> > > minimum that works. With 24ms I already started to have failures. So
> > > maybe just bump it up by an order of magnitude to 250ms for some
> > > safety margin?
>
> I left the thing running for the weekend and it failed 26 out of 16057
> times with the 25ms timeout. Looks like it takes ~5 minutes to resume
> when it fails, but eventually it does come back.
>

Just came back from travel. Yes, the 5 minute delay may be due to the
expiration of the HPET timer: counting from 0 to 0xffffffff on a 13 MHz
HPET takes about 300 seconds. After resume it seems nobody arms it, so
my old patch forces one event to be armed.
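
As a rough check of the wrap-around arithmetic, here is a small Python
sketch. It assumes a 32-bit free-running up-counter; the helper name
wrap_seconds is made up for illustration, and the 14.31818 MHz rate is an
assumption (a rate commonly seen for HPETs on PC chipsets), not a value
taken from this thread. Both rates land at roughly five minutes.

```python
def wrap_seconds(freq_hz, bits=32):
    """Seconds for a free-running up-counter of the given width
    to count from 0 to its maximum value at freq_hz."""
    return (2**bits - 1) / freq_hz


if __name__ == "__main__":
    # 13 MHz is the figure quoted in the mail; 14.31818 MHz is an
    # assumed rate, included only for comparison.
    for label, hz in (("13 MHz", 13_000_000), ("14.31818 MHz", 14_318_180)):
        print(f"{label}: {wrap_seconds(hz):.0f} s")
```

This gives about 330 s at 13 MHz and about 300 s at 14.31818 MHz, which
matches the ~5 minute resume delay seen when nothing re-arms the timer.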

Thanks,
Feng