AMD Zen microcode updates break boot

From: Jens Axboe
Date: Fri Sep 27 2024 - 11:17:54 EST


Hi,

Got home from conference travels and updated two test boxes to current
-git (sha 075dbe9f6e3c), both AMD boxes. One of them boots fine, the
other one does not. One is a Dell R7525, cpu:

2 socket AMD EPYC 7763 64-Core Processor

and it boots fine on -git. The other is a Dell R7625, cpu:

2 socket AMD EPYC 9754 128-Core Processor

and that one does not boot. I just get a black screen at the point where
the kernel should load. Because I didn't have much to go on here, I
bisected the issue, and it came up with:

94838d230a6c835ced1bad06b8759e0a5f19c1d3 is the first bad commit
commit 94838d230a6c835ced1bad06b8759e0a5f19c1d3 (HEAD)
Author: Borislav Petkov <bp@xxxxxxxxx>
Date: Thu Jul 25 13:20:37 2024 +0200

x86/microcode/AMD: Use the family,model,stepping encoded in the patch ID

which seems plausible. And indeed, reverting that commit (and its fixup)
on top of current -git solves it. Happy to test patches, but
unfortunately I don't have much to offer in terms of an oops or anything
else to help diagnose this. In lieu of instant ideas to prevent this
issue on -rc1, perhaps a revert?

--
Jens Axboe