Re: What is the right practice to get new code upstream( was Fwd:[patch] a simple hardware detector for latency as well as throughput ver.0.1.0)

From: Peter Zijlstra
Date: Thu Jun 14 2012 - 06:05:18 EST


On Tue, 2012-06-12 at 20:57 +0800, Luming Yu wrote:
> The goal is to test for hardware/BIOS-caused
> problems in 2 minutes. To me, being able to measure the intrinsic
> behaviour of the hardware on which we build our software is always a
> better idea than blindly looking for data in documents. In the current
> version of the tool, we have a basic sampling facility and a TSC test
> ready for x86. I plan to add more tests to this tool to enrich our tool
> set in Linux.
>
>
There's SMI damage out there with periods much longer than 2 minutes, so
a 2 minute run won't catch all of it.

Also, you can't really do stop_machine for 2 minutes and expect the
system to survive.

Furthermore, I think that, esp. on more recent chips, there are better
ways of doing it.

For Intel there's an IA32_DEBUGCTL.FREEZE_WHILE_SMM_EN [bit 14]. If you
program a PMU event that ticks at the same rate as the TSC and enable
the FREEZE_WHILE_SMM stuff, the PMU counter stops while in SMM and the
TSC keeps running, so any drift observed between the two is time lost to
SMM. Intel also has MSR_SMI_COUNT [MSR 34H], which counts the number of
SMIs.
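
A rough userspace sketch of the two Intel measurements above, not a tested
tool. It assumes the msr driver is loaded so that /dev/cpu/N/msr exists,
that IA32_DEBUGCTL sits at 1D9H as in the SDM, that the SMI count is the
low 32 bits of MSR 34H, and that the PMU counter is 48 bits wide. The
helper names (read_msr, smi_count, counter_delta, smm_cycles) are made up
for illustration, and programming the PMU event itself is left out.

```python
import os
import struct

MSR_SMI_COUNT = 0x34            # from the mail: counts the number of SMIs
IA32_DEBUGCTL = 0x1D9           # assumption: MSR address as listed in the Intel SDM
FREEZE_WHILE_SMM_EN = 1 << 14   # bit 14, from the mail


def read_msr(cpu, reg):
    """Read one 64-bit MSR via the msr driver (needs root and 'modprobe msr')."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The msr device uses the file offset as the MSR number.
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def smi_count(raw):
    """Assumption: the SMI count is the low 32 bits of MSR_SMI_COUNT."""
    return raw & 0xFFFFFFFF


def counter_delta(before, after, width=64):
    """Difference of two samples of a free-running counter, allowing one wrap."""
    return (after - before) % (1 << width)


def smm_cycles(tsc_before, tsc_after, pmu_before, pmu_after, pmu_width=48):
    """Cycles lost to SMM: TSC progress minus progress of a PMU counter that
    ticks at the TSC rate but is frozen in SMM (FREEZE_WHILE_SMM_EN set)."""
    tsc_delta = counter_delta(tsc_before, tsc_after, 64)
    pmu_delta = counter_delta(pmu_before, pmu_after, pmu_width)
    return tsc_delta - pmu_delta
```

Sampling smi_count(read_msr(0, MSR_SMI_COUNT)) before and after a run gives
the number of SMIs CPU 0 took in between, and smm_cycles() applied to
matching TSC/PMU samples gives the time they cost. Neither needs
stop_machine.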

For AMD there's only event 02Bh, which is SMIs Received. I'm not sure
whether it has anything like the FREEZE bit or whether the event is
modifiable to count the cycles spent in SMM.