Re: Linux 4.15-rc2: Regression in resume from ACPI S3

From: Thomas Gleixner
Date: Wed Dec 13 2017 - 13:19:30 EST


On Wed, 13 Dec 2017, Linus Torvalds wrote:

> On Wed, Dec 13, 2017 at 8:41 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >
> > Definitely. That was fragile forever, but what puzzles me is that I can't
> > figure out what now causes that spurious interrupt to surface out of the
> > blue.
>
> Perhaps just timing?

That's what I'm trying to figure out right now, because that is the only
sensible explanation left. The whole machinery of suspend is exactly the
same with and without the vector changes. I instrumented all functions
involved and the picture is the same. I do not even see any fundamental
timing differences where one would say: That's it.

What puzzles me even more is that in the range of commits I'm fiddling with
there is no change other than the vector management stuff, and the point
where it breaks makes no sense at all. The point Maarten bisected it to
works nicely here, so that might just point to a very subtle timing issue.

> How hard would it be to change the ordering to just redirect irqs first?

The whole interrupt redirection happens when the non-boot CPUs are brought
down, which is the very last step before the actual suspend happens.

We could probably do that earlier, but that's something Rafael needs to
answer ultimately.

Thanks,

tglx