Re: [BUG] x86: reboot doesn't reboot

From: Tobias Klausmann
Date: Fri Apr 04 2014 - 12:04:00 EST



On 04.04.2014 17:45, Matthew Garrett wrote:
> On Fri, Apr 04, 2014 at 08:38:42AM -0700, Linus Torvalds wrote:
> > On Fri, Apr 4, 2014 at 8:12 AM, Matthew Garrett <mjg59@xxxxxxxxxxxxx> wrote:
> > > Production hardware should never require CF9.
> > That's total BS.
> >
> > The fact is, we may be doing something wrong, but ACPI fails on a
> > *lot* of systems. A huge swath of Dell machines in particular for some
> > reason (laptops, desktops, _and_ now there's tablet reports).
> Which is almost certainly because the other reboot methods are trapping
> into SMI and hitting some hardware that we've left in a different state
> to Windows. CF9 may work around that, but the actual fix is to figure
> out why the firmware is wedging and fix it. Otherwise we're going to
> spend the rest of our lives maintaining a giant DMI list that's still
> going to be missing entries and users are going to be sad.
>
> > The keyboard controller is sadly unreliable too, although I really
> > don't understand why. Even when a legacy keyboard controller exists
> > (which isn't as universal as you'd think, even though the *hardware*
> > is pretty much guaranteed to be there in the chipset, it can be
> > disabled) there seem to be machines where the reset line isn't hooked
> > up. Don't ask me why. Same goes for the triple fault failure case.
> See: SMI. Or in the triple fault case, because there's some early init
> code that has the same issue. As far as I can tell Windows never triple
> faults, so again I think this is our fault at some level.
>
> > It would be interesting if somebody can figure out *exactly* what
> > Windows does, because the fact that a lot of Dell machines need quirks
> > almost certainly means that it's _us_ doing something wrong. Dell
> > doesn't generally do lots of fancy odd things. I pretty much guarantee
> > it's because we've done something odd that Windows doesn't do.
> Windows hits the keyboard controller and then tries the ACPI vector. It
> then sleeps for a short period, then tries the keyboard controller again
> and the ACPI vector again. This means that systems which put cf9 in the
> ACPI vector tend to work because of the second write, which is obviously
> not what the spec envisaged but here we are. The only time it hits CF9
> is when the ACPI tables tell it to.
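
To make the ordering described above concrete, here is a minimal sketch in C of that retry sequence: keyboard controller, ACPI vector, a short wait, then both again. It is only a simulation under stated assumptions. Nothing here touches real I/O ports, and the names windows_like_reboot(), try_reset(), short_delay() and the succeeds_on parameter are hypothetical, made up for illustration; this is not kernel code and not a statement of what Windows actually implements beyond the ordering given in the quoted text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the two reset methods named in the thread. */
enum reset_method { RESET_KBD, RESET_ACPI };

#define MAX_ATTEMPTS 8

/* Records the order in which reset methods were tried. */
struct reset_log {
    enum reset_method attempts[MAX_ATTEMPTS];
    size_t count;
};

/*
 * Stand-in for a real reset attempt (assumption: no hardware access).
 * It logs the method and returns 1 if the simulated machine resets on
 * this attempt. succeeds_on is the 1-based attempt number that "works";
 * 0 means the machine never resets.
 */
static int try_reset(struct reset_log *rl, enum reset_method m,
                     size_t succeeds_on)
{
    assert(rl != NULL && rl->count < MAX_ATTEMPTS);
    rl->attempts[rl->count++] = m;
    return succeeds_on != 0 && rl->count == succeeds_on;
}

/* Placeholder for the short sleep between the two passes. */
static void short_delay(void)
{
}

/*
 * Ordering as described in the quoted text: keyboard controller, ACPI
 * vector, short wait, keyboard controller again, ACPI vector again.
 * Returns 1 if the simulated machine reset, 0 if every attempt failed.
 */
int windows_like_reboot(struct reset_log *rl, size_t succeeds_on)
{
    for (int pass = 0; pass < 2; pass++) {
        if (try_reset(rl, RESET_KBD, succeeds_on))
            return 1;
        if (try_reset(rl, RESET_ACPI, succeeds_on))
            return 1;
        if (pass == 0)
            short_delay();
    }
    return 0;
}
```

Calling windows_like_reboot() with succeeds_on set to 4 models the case mentioned above, where a machine only goes down on the second write to the ACPI vector.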


Hi,
sorry to jump into the discussion at this random point rather than at the start, but with the latest Linus tree my system needs several minutes to reboot instead of a few seconds, so this may be related. If I can help in any way, let me know!

Thanks,
Tobias Klausmann