Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.
From: David P. Reed
Date: Tue Jan 01 2008 - 10:59:20 EST
Alan Cox wrote:
>> responds to reads differently than "unused" ports. In particular, an
>> inb takes 1/2 the elapsed time compared to a read to "known" unused port
>> 0xed - 792 tsc ticks for port 80 compared to about 1450 tsc ticks for
>> port 0xed and other unused ports (tsc at 800 MHz).
> Well at least we know where the port is now - that's too fast for an LPC
> bus device, so it must be an SMI trap.
> Only easy way to find out is to use the debugging event counters and see
> how many instruction cycles are issued as part of the 0x80 port. If it's
> surprisingly high then you've got a firmware bug and can go spank HP.
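For scale, the tick counts quoted above can be converted to elapsed time.
This is a minimal sketch, assuming the TSC really does count at a constant
800 MHz as stated; the helper names are made up for illustration and are
not kernel functions.

```c
#include <stdio.h>

/*
 * Convert a TSC tick count to nanoseconds.
 * Assumption: the TSC runs at a constant rate of tsc_mhz MHz
 * (800 MHz in the measurements quoted above).
 * tsc_ticks_to_ns() is a hypothetical helper, not a kernel API.
 */
double tsc_ticks_to_ns(unsigned long ticks, double tsc_mhz)
{
    /* ticks divided by MHz gives microseconds; scale by 1000 for ns */
    return (double)ticks * 1000.0 / tsc_mhz;
}

/* Print the two measured cases side by side. */
void print_port_timings(void)
{
    printf("port 0x80: %.1f ns\n", tsc_ticks_to_ns(792, 800.0));
    printf("port 0xed: %.1f ns\n", tsc_ticks_to_ns(1450, 800.0));
}
```

With these figures the port 0x80 read comes to roughly 0.99 microseconds
and the port 0xed read to about 1.81 microseconds.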
Alan, thank you for the pointers. I have been doing variations on this
testing theme for a while - I get intrigued by a good debugging
challenge, and after all it's my machine...
Two relevant new data points, and then some more suggestions:
1. It appears to be a real port. SMI traps are not happening in the
normal outb to 80. Hundreds of them execute perfectly with the expected
instruction counts. If I can trace the particular event that creates
the hard freeze (getting really creative, here) and stop before the
freeze disables the entire computer, I will. That may be an SMI, or
perhaps any other kind of interrupt or exception. Maybe someone knows
how to safely trace through an impending SMI while doing printk's or
something?
2. It appears to be the standard POST diagnostic port. On a whim, I
disassembled my DSDT code, and studied it more closely. It turns out
that there are a bunch of "Store(..., DBUG)" instructions scattered
throughout, and when you look at what DBUG is defined as, it is defined
as an IO Port at IO address DBGP, which is a 1-byte value = 0x80. So
the ACPI BIOS thinks it has something to do with debugging. There's a
little strangeness here, however, because the value sent to the port
occasionally has something to do with arguments to the ACPI operations
relating to sleep and wakeup ... could just be that those arguments are
distinctive.
In thinking about this, I recognize a couple of things. First, ACPI is telling
us something when it declares a reference to port 80 in its code. It's
not telling us the function of this port on this machine, but it is
telling us that it is being used by the BIOS. This could be a reason
to put out a printk warning message... 'warning: port 80 is used by
ACPI BIOS - if you are experiencing problems, you might try an alternate
means of iodelay.'
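The check behind such a warning could be sketched roughly as follows. This
is a userspace illustration only: struct io_region, region_claims_port() and
warn_if_port80_claimed() are hypothetical names, the list of regions is
assumed to have been extracted from the ACPI tables by some other means, and
fprintf stands in for printk.

```c
#include <stddef.h>
#include <stdio.h>

#define IODELAY_PORT 0x80u

/* Hypothetical description of one I/O port range declared by the ACPI BIOS. */
struct io_region {
    const char *name;     /* label for the range, e.g. as named in the DSDT */
    unsigned int base;    /* first I/O port in the range */
    unsigned int length;  /* number of ports in the range */
};

/* Return 1 if the region covers the given port, 0 otherwise. */
int region_claims_port(const struct io_region *r, unsigned int port)
{
    return r->length != 0 && port >= r->base && port - r->base < r->length;
}

/*
 * Scan the regions and emit the warning once if any of them covers
 * port 0x80. Returns 1 if the warning was printed, 0 otherwise.
 * The "(%s)" naming the region is an addition for illustration.
 */
int warn_if_port80_claimed(const struct io_region *regions, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (region_claims_port(&regions[i], IODELAY_PORT)) {
            fprintf(stderr,
                    "warning: port 80 is used by ACPI BIOS (%s) - if you are "
                    "experiencing problems, you might try an alternate "
                    "means of iodelay.\n", regions[i].name);
            return 1;
        }
    }
    return 0;
}
```

A one-byte region based at 0x80, like the one my DSDT declares, would
trigger the warning; a region that does not include port 0x80 would not.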
Second, it seems likely that one of two possible causes explains why
the port 80 writes cause hangs/freezes:
1. a buffer overflow in the device behind port 80.
2. there is some "meaning" to certain byte values being written (the
_PTS and _WAK use of arguments that come from callers to store into port
80 makes me suspicious). That might mean that the freeze happens only
when certain values are written, or when they are written closely in
time to some other action (being used to communicate something to the
SMM code). If there is some race in when Linux's port 80 writes happen
that happens to change the meaning of a request to the hardware or to
SMM, then we could be rarely stepping on