Re: PROBLEM: mce: [Hardware Error] from dmesg -l emerg

From: Jeffrin Thalakkottoor
Date: Mon May 21 2018 - 15:16:47 EST


> (*) Can you send a snip from the raw dmesg output that starts
> a couple of lines before:
>
>
> ... [Hardware Error]: CPU 0: Machine Check: 0 Bank: 5 ...
>
> and continues a couple of lines past
>
> ... [Hardware Error]: PROCESSOR 0:306d4 ...
>
> and I'll take a look at why mcelog choked.



------------------------------------------------------------------------------------------------------------------------------------->
$ sudo dmesg -r | grep -B 30 "Bank"
x2apic: IRQ remapping doesn't support X2APIC mode
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
tsc: Fast TSC calibration using PIT
tsc: Detected 1895.567 MHz processor
clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x36a5a03c965, max_idle_ns: 881590412318 ns
Calibrating delay loop (skipped), value calculated using timer frequency.. 3791.13 BogoMIPS (lpj=7582268)
pid_max: default: 32768 minimum: 301
Security Framework initialized
Yama: becoming mindful.
AppArmor: AppArmor initialized
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Mount-cache hash table entries: 8192 (order: 4, 65536 bytes)
Mountpoint-cache hash table entries: 8192 (order: 4, 65536 bytes)
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)
mce: CPU supports 7 MCE banks
CPU0: Thermal monitoring enabled (TM1)
process: using mwait in idle threads
Last level iTLB entries: 4KB 64, 2MB 8, 4MB 8
Last level dTLB entries: 4KB 64, 2MB 0, 4MB 0, 1GB 4
Spectre V2 : Mitigation: Full generic retpoline
Spectre V2 : Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier
Spectre V2 : Enabling Restricted Speculation for firmware calls
Freeing SMP alternatives memory: 32K
TSC deadline timer enabled
smpboot: CPU0: Intel(R) Pentium(R) CPU 3825U @ 1.90GHz (family: 0x6, model: 0x3d, stepping: 0x4)
mce: [Hardware Error]: Machine check events logged
mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 5: ee0000000040110b
$
---------------------------------------------------------------------------------------------------------------------------------------------->
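For reference, the Bank 5 status word above (ee0000000040110b) can be split into its architectural fields. This is only a minimal Python sketch, assuming the standard IA32_MCi_STATUS bit layout from the Intel SDM. The name decode_mci_status is hypothetical, and this is not a description of what mcelog does internally.

```python
# Bit positions of the architectural flags in IA32_MCi_STATUS
# (assumed from the Intel SDM; not taken from mcelog).
FLAGS = {
    "VAL": 63,    # register contains a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # error reporting was enabled for this error
    "MISCV": 59,  # IA32_MCi_MISC holds valid data
    "ADDRV": 58,  # IA32_MCi_ADDR holds valid data
    "PCC": 57,    # processor context may be corrupted
}


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into flags and error codes."""
    fields = {name: (status >> bit) & 1 for name, bit in FLAGS.items()}
    fields["mca_error_code"] = status & 0xFFFF              # bits 15:0
    fields["model_specific_code"] = (status >> 16) & 0xFFFF  # bits 31:16
    return fields


if __name__ == "__main__":
    # Status value copied from the "Bank 5" line in the dmesg output.
    for name, value in decode_mci_status(0xEE0000000040110B).items():
        print(f"{name:>20}: {value:#x}")
```

With the value from the log, this reports VAL, OVER, UC, MISCV, ADDRV and PCC set, EN clear, and an MCA error code of 0x110b.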



$ sudo dmesg -r | grep -A 30 "0:306d4"
mce: [Hardware Error]: PROCESSOR 0:306d4 TIME 1526932210 SOCKET 0 APIC 0 microcode 2a
Performance Events: PEBS fmt2+, Broadwell events, 16-deep LBR, full-width counters, Intel PMU driver.
... version: 3
... bit width: 48
... generic registers: 4
... value mask: 0000ffffffffffff
... max period: 00007fffffffffff
... fixed-purpose events: 3
... event mask: 000000070000000f
Hierarchical SRCU implementation.
NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
smp: Bringing up secondary CPUs ...
x86: Booting SMP configuration:
.... node #0, CPUs: #1 #2 #3
smp: Brought up 1 node, 4 CPUs
smpboot: Max logical packages: 1
smpboot: Total of 4 processors activated (15164.53 BogoMIPS)
devtmpfs: initialized
x86/mm: Memory block size: 128MB
PM: Registering ACPI NVS region [mem 0x9cc8e000-0x9cf8dfff] (3145728 bytes)
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
futex hash table entries: 1024 (order: 4, 65536 bytes)
pinctrl core: initialized pinctrl subsystem
NET: Registered protocol family 16
audit: initializing netlink subsys (disabled)
audit: type=2000 audit(1526932210.048:1): state=initialized audit_enabled=0 res=1
cpuidle: using governor ladder
cpuidle: using governor menu
Simple Boot Flag at 0x44 set to 0x1
ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
ACPI: bus type PCI registered
$
------------------------------------------------------------------------------------------------------------------------>


--
software engineer
rajagiri school of engineering and technology