Re: using mce_inject I get: RIP 10:<ffffffffa012c909> {ttm_bo_unref+0xf/0x45[ttm]}

From: Justin P. Mattock
Date: Sun Aug 21 2011 - 19:08:46 EST


On 08/21/2011 03:16 PM, Andi Kleen wrote:
On Sat, Aug 20, 2011 at 07:31:06PM -0700, Justin P. Mattock wrote:
I am not sure if I am running mce_test correctly, but during its test
routine everything pauses, and then the below shows up
in dmesg:

The message is expected, but there should be no noticeable
pause.

Well, after running more of these tests I am still getting a noticeable pause. It lasts about 2-3 seconds, then everything goes back to normal.
(All of these occur whenever the tests do a timeout test.)


-Andi


http://fpaste.org/kMRd/


[ 1810.670434] Triggering MCE exception on CPU 1
[ 1810.670462] [Hardware Error]: CPU 1: Machine Check Exception: 6 Bank 4: b300000000000000
[ 1810.670467] [Hardware Error]: RIP 73:<0000000012343434>
[ 1810.670470] [Hardware Error]: TSC 38d1002c216
[ 1810.670474] [Hardware Error]: PROCESSOR 0:6f6 TIME 1313892803 SOCKET 0 APIC 1
[ 1810.670477] [Hardware Error]: Run the above through 'mcelog --ascii'
[ 1810.670481] [Hardware Error]: Machine check: Processor context corrupt
[ 1810.670483] [Hardware Error]: Fake kernel panic: Fatal Machine check
[ 1810.670495] MCE exception done on CPU 1
[ 1819.064721] Triggering MCE exception on CPU 1
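
For reference, the bank status word in the log above (b300000000000000) can be decoded by hand. The following is only a rough sketch, assuming the high flag bits follow the IA32_MCi_STATUS layout from the Intel SDM (VAL=63, OVER=62, UC=61, EN=60, MISCV=59, ADDRV=58, PCC=57). The helper name decode_mci_status is made up for illustration, and the lower bits (error codes and so on) are deliberately left undecoded.

```python
# Sketch: decode the top flag bits of an IA32_MCi_STATUS value.
# Bit positions are assumed from the Intel SDM layout; the function
# name is hypothetical and not part of mcelog or mce-test.
MCI_STATUS_FLAGS = {
    63: "VAL",    # register contents are valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds valid data
    57: "PCC",    # processor context corrupt
}

def decode_mci_status(status):
    """Return the names of the flag bits set in status, highest bit first."""
    return [
        name
        for bit, name in sorted(MCI_STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]

if __name__ == "__main__":
    # Status value taken from the "Bank 4" line in the dmesg output.
    print(decode_mci_status(0xB300000000000000))
```

With PCC set, the "Processor context corrupt" line in the log is what one would expect from the injected value.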

It seems like a slight pause, then everything resumes properly (music, etc.).
Is this something that needs attention, or are these tests as extreme as
can be, so that this should simply be ignored?
(Note: if there is an MCE list somewhere, let me know so I can direct this to
the proper people.)

Justin P. Mattock


