unknown NMI on AMD Rome

From: Jiri Olsa
Date: Tue Mar 16 2021 - 11:46:23 EST


hi,
when running 'perf top' on AMD Rome (/proc/cpuinfo below)
with fedora 33 kernel 5.10.22-200.fc33.x86_64

we got unknown NMI messages:

[ 226.700160] Uhhuh. NMI received for unknown reason 3d on CPU 90.
[ 226.700162] Do you have a strange power saving mode enabled?
[ 226.700163] Dazed and confused, but trying to continue
[ 226.769565] Uhhuh. NMI received for unknown reason 3d on CPU 84.
[ 226.769566] Do you have a strange power saving mode enabled?
[ 226.769567] Dazed and confused, but trying to continue
[ 226.769771] Uhhuh. NMI received for unknown reason 2d on CPU 24.
[ 226.769773] Do you have a strange power saving mode enabled?
[ 226.769774] Dazed and confused, but trying to continue
[ 226.812844] Uhhuh. NMI received for unknown reason 2d on CPU 23.
[ 226.812846] Do you have a strange power saving mode enabled?
[ 226.812847] Dazed and confused, but trying to continue
[ 226.893783] Uhhuh. NMI received for unknown reason 2d on CPU 27.
[ 226.893785] Do you have a strange power saving mode enabled?
[ 226.893786] Dazed and confused, but trying to continue
[ 226.900139] Uhhuh. NMI received for unknown reason 2d on CPU 40.
[ 226.900141] Do you have a strange power saving mode enabled?
[ 226.900143] Dazed and confused, but trying to continue
[ 226.908763] Uhhuh. NMI received for unknown reason 3d on CPU 120.
[ 226.908765] Do you have a strange power saving mode enabled?
[ 226.908766] Dazed and confused, but trying to continue
[ 227.751296] Uhhuh. NMI received for unknown reason 2d on CPU 83.
[ 227.751298] Do you have a strange power saving mode enabled?
[ 227.751299] Dazed and confused, but trying to continue
[ 227.752937] Uhhuh. NMI received for unknown reason 3d on CPU 23.

also when discussing ths with Borislav, he managed to reproduce easily
on his AMD Rome machine

any idea?

thanks,
jirka


---
processor : 0
vendor_id : AuthenticAMD
cpu family : 23
model : 49
model name : AMD EPYC 7742 64-Core Processor
stepping : 0
microcode : 0x8301034
cpu MHz : 1497.024
cache size : 512 KB
physical id : 0
siblings : 64
core id : 0
cpu cores : 64
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 16
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd mba sev ibrs ibpb stibp vmmcall sev_es fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca
bugs : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
bogomips : 4491.76
TLB size : 3072 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]