Re: From 3540f985652f41041e54ee82aa53e7dbd55739ae Mon Sep 17 00:00:00 2001
From: Sandipan Das
Date: Fri Sep 22 2023 - 06:23:45 EST
On 9/22/2023 3:33 PM, Ingo Molnar wrote:
>
> * Sandipan Das <sandipan.das@xxxxxxx> wrote:
>
>> Zen 4 systems running buggy microcode can hit a WARN_ON() in the PMI
>> handler, as shown below, several times while perf runs. A simple
>> `perf top` run is enough to render the system unusable.
>>
>> WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0
>>
>> This happens because the Performance Counter Global Status Register
>> (PerfCntGlobalStatus) has one or more bits set which are considered
>> reserved according to the "AMD64 Architecture Programmer's Manual,
>> Volume 2: System Programming, 24593". The document can be found at
>> https://www.amd.com/system/files/TechDocs/24593.pdf
>>
>> To make this less intrusive, warn just once if any reserved bit is set
>> and prompt the user to update the microcode. Also sanitize the value to
>> what the code is handling, so that the overflow events continue to be
>> handled for the number of counters that are known to be sane.
>>
>> Going forward, the following microcode patch levels are recommended
>> for Zen 4 processors in order to avoid such issues with reserved bits.
>>
>> Family=0x19 Model=0x11 Stepping=0x01: Patch=0x0a10113e
>> Family=0x19 Model=0x11 Stepping=0x02: Patch=0x0a10123e
>> Family=0x19 Model=0xa0 Stepping=0x01: Patch=0x0aa00116
>> Family=0x19 Model=0xa0 Stepping=0x02: Patch=0x0aa00212
>>
>> Commit f2eb058afc57 ("linux-firmware: Update AMD cpu microcode") from
>> the linux-firmware tree has binaries that meet the minimum required
>> patch levels.
>>
>> Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling")
>> Reported-by: Jirka Hladky <jhladky@xxxxxxxxxx>
>> Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
>> [sandipan: add message to prompt users to update microcode]
>> [sandipan: rework commit message and call out required microcode levels]
>> Signed-off-by: Sandipan Das <sandipan.das@xxxxxxx>
>
>> v2:
>> - Use pr_warn_once() instead of WARN_ON_ONCE() to prompt users to
>> update microcode
>> - Rework commit message and add details of minimum required microcode
>> patch levels.
>
> 1)
>
> I don't think you ever re-sent this patch with the correct subject line.
> ( Or at least it's not in my mbox. )
>
> 2)
>
> So if the fix is from Breno Leitao originally, then there should be a:
>
> From: Breno Leitao <leitao@xxxxxxxxxx>
>
> at the beginning of the patch to make authorship clear.
>
> You might also want to add:
>
> Co-developed-by: Sandipan Das <sandipan.das@xxxxxxx>
>
> to make your contributions clear.
>
Sorry for the confusion. I did resend this patch with the correct authorship
and it can be found here:
https://lore.kernel.org/all/3540f985652f41041e54ee82aa53e7dbd55739ae.1694696888.git.sandipan.das@xxxxxxx/
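
For readers following the thread, the "warn once and sanitize" approach
from the quoted commit message can be illustrated with a minimal
user-space sketch. This is not the kernel patch itself: counter_mask()
and sanitize_status() are hypothetical names made up for illustration,
fprintf() stands in for pr_warn_once(), and the sketch assumes the
handled counters occupy the low bits of the status value.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: mask with the low num_counters bits set. */
static uint64_t counter_mask(unsigned int num_counters)
{
	if (num_counters >= 64)
		return UINT64_MAX;
	return (UINT64_C(1) << num_counters) - 1;
}

/*
 * Hypothetical stand-in for the sanitizing step: report reserved bits
 * only the first time they are seen, then drop them so that overflow
 * handling continues for the counters the code knows about.
 */
static uint64_t sanitize_status(uint64_t status, uint64_t known_mask)
{
	static int warned;	/* user-space analogue of "warn once" */
	uint64_t reserved = status & ~known_mask;

	if (reserved && !warned) {
		warned = 1;
		fprintf(stderr,
			"reserved status bits set (0x%llx), consider updating microcode\n",
			(unsigned long long)reserved);
	}

	return status & known_mask;
}
```

With six handled counters, a status value that has a stray high bit set
alongside counter bits 0 and 2 comes back with only bits 0 and 2 set, and
the message is printed at most once no matter how often the function is
called.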