Re: unknown NMI on AMD Rome

From: Kim Phillips
Date: Tue Mar 16 2021 - 16:03:21 EST


On 3/16/21 2:53 PM, Peter Zijlstra wrote:
> On Tue, Mar 16, 2021 at 04:45:02PM +0100, Jiri Olsa wrote:
>> hi,
>> when running 'perf top' on AMD Rome (/proc/cpuinfo below)
>> with fedora 33 kernel 5.10.22-200.fc33.x86_64
>>
>> we got unknown NMI messages:
>>
>> [ 226.700160] Uhhuh. NMI received for unknown reason 3d on CPU 90.
>> [ 226.700162] Do you have a strange power saving mode enabled?
>> [ 226.700163] Dazed and confused, but trying to continue
>> [ 226.769565] Uhhuh. NMI received for unknown reason 3d on CPU 84.
>> [ 226.769566] Do you have a strange power saving mode enabled?
>> [ 226.769567] Dazed and confused, but trying to continue
>> [ 226.769771] Uhhuh. NMI received for unknown reason 2d on CPU 24.
>> [ 226.769773] Do you have a strange power saving mode enabled?
>> [ 226.769774] Dazed and confused, but trying to continue
>> [ 226.812844] Uhhuh. NMI received for unknown reason 2d on CPU 23.
>> [ 226.812846] Do you have a strange power saving mode enabled?
>> [ 226.812847] Dazed and confused, but trying to continue
>> [ 226.893783] Uhhuh. NMI received for unknown reason 2d on CPU 27.
>> [ 226.893785] Do you have a strange power saving mode enabled?
>> [ 226.893786] Dazed and confused, but trying to continue
>> [ 226.900139] Uhhuh. NMI received for unknown reason 2d on CPU 40.
>> [ 226.900141] Do you have a strange power saving mode enabled?
>> [ 226.900143] Dazed and confused, but trying to continue
>> [ 226.908763] Uhhuh. NMI received for unknown reason 3d on CPU 120.
>> [ 226.908765] Do you have a strange power saving mode enabled?
>> [ 226.908766] Dazed and confused, but trying to continue
>> [ 227.751296] Uhhuh. NMI received for unknown reason 2d on CPU 83.
>> [ 227.751298] Do you have a strange power saving mode enabled?
>> [ 227.751299] Dazed and confused, but trying to continue
>> [ 227.752937] Uhhuh. NMI received for unknown reason 3d on CPU 23.
>>
>> also when discussing this with Borislav, he managed to reproduce easily
>> on his AMD Rome machine
>>
>> any idea?
>
> Kim is the AMD point person for this I think..

Since perf top requests precise sampling by default, which on AMD
is backed by IBS, this looks like it's hitting erratum #1215:

https://developer.amd.com/wp-content/resources/56323-PUB_0.78.pdf
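
To gauge how often and on which CPUs the messages show up, something
like the following could tally them from a saved dmesg. This is only a
minimal sketch and not part of the original report: the regex is an
assumption derived from the log lines quoted above, and the name
tally_unknown_nmis is hypothetical.

```python
import re
from collections import Counter

# Assumed pattern, based only on the dmesg lines quoted in this thread:
#   "Uhhuh. NMI received for unknown reason 3d on CPU 90."
NMI_RE = re.compile(
    r"NMI received for unknown reason ([0-9a-f]{2}) on CPU (\d+)"
)


def tally_unknown_nmis(lines):
    """Count unknown-NMI messages per reason code and per CPU.

    `lines` is any iterable of log lines (a list, an open file, ...).
    Returns a (by_reason, by_cpu) pair of Counters; lines that do not
    match the pattern are ignored.
    """
    by_reason = Counter()
    by_cpu = Counter()
    for line in lines:
        match = NMI_RE.search(line)
        if match is None:
            continue
        by_reason[match.group(1)] += 1   # reason stays a hex string
        by_cpu[int(match.group(2))] += 1  # CPU number as an integer
    return by_reason, by_cpu
```

Passing it an open file containing the dmesg output returns the two
counters, which can then be compared between a run with perf top and a
run without it.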

Kim