Re: SEV-ES guest shutdown: linux-next regression with QEMU 10.2.2; smp>1

From: Tom Lendacky

Date: Wed Apr 01 2026 - 13:43:34 EST


On 4/1/26 11:18, Sean Christopherson wrote:
> On Wed, Apr 01, 2026, Srikanth Aithal wrote:
>> On 4/1/2026 6:23 PM, Tom Lendacky wrote:
>>> On 4/1/26 06:24, Aithal, Srikanth wrote:
>>>> Hello Tom,
>>>>
>>>>
>>>> On 3/23/2026 6:40 PM, Tom Lendacky wrote:
>>>>> On 3/20/26 11:08, Aithal, Srikanth wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I am hitting a failure when shutting down a SEV-ES guest (smp>1) on
>>>>>> recent linux-next, and narrowed it down with bisection on the host
>>>>>> kernel. The issue appears with more than one vCPU (e.g. -smp 2); with
>>>>>> -smp 1 shutdown completes normally in my tests. The same guest shutdown
>>>>>> path works with an older host kernel (<next-20260304) and is also
>>>>>> avoided with current QEMU master or by cherry-picking a specific QEMU
>>>>>> commit onto v10.2.2.
>>>>>>
>>>>>> Environment:
>>>>>> Host kernel: linux-next, tag next-20260319 [1] (also observed starting
>>>>>> from next-20260304).
>>>>>> Guest: SEV-ES Linux guest; -smp 2 (or more) reproduces the issue; -smp 1
>>>>>> does not in my testing.
>>>>>> Hypervisor / QEMU: Initially QEMU v10.2.2 (stable). Later tested QEMU
>>>>>> master at 8e711856d763 [2].
>>>>>>
>>>>>> Details on issue:
>>>>>>
>>>>>> After SEV-ES guest shutdown, the serial log shows a register dump
>>>>>> (example below).
>>>>>>
>>>>>> [   12.613383] reboot: Power down
>>>>>> EAX=00000000 EBX=00000000 ECX=00000000 EDX=00a00f11
>>>>>> ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
>>>>>> EIP=0000b004 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1
>>>>>> ES =0000 00000000 0000ffff 00009300
>>>>>> CS =f000 00800000 0000ffff 00009b00
>>>>>> SS =0000 00000000 0000ffff 00009300
>>>>>> DS =0000 00000000 0000ffff 00009300
>>>>>> FS =0000 00000000 0000ffff 00009300
>>>>>> GS =0000 00000000 0000ffff 00009300
>>>>>> LDT=0000 00000000 0000ffff 00008200
>>>>>> TR =0000 00000000 0000ffff 00008b00
>>>>>> GDT=     00000000 0000ffff
>>>>>> IDT=     00000000 0000ffff
>>>>>> CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
>>>>>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>>>>>> DR3=0000000000000000
>>>>>> DR6=00000000ffff0ff0 DR7=0000000000000400
>>>>>> EFER=0000000000000000
>>>>>> Code=b0 96 b5 61 ca ef 3f 51 00 c3 65 51 19 77 b1 e0 e5 e2 91 b8 <0c> 5d
>>>>>> c7 fc 59 bc 2b 6f 90 89 44 23 ec ec 2f 62 fd e0 8f d5 c7 31 24 70 e2 7d
>>>>>> c6 ee 00 00
>>>>>> -> Hangs here
>>>>>>
>>>>>> Host kernel bisect (with QEMU v10.2.2) led to:
>>>>>>
>>>>>> Good (no crash on guest shutdown):
>>>>>> 32d76cdfa1222c88262da5b12e0b2bba444c96fa
>>>>>> KVM: SVM: Move core EFER.SVME enablement to kernel (local build tagged
>>>>>> 7.0.0-rc232d76cdfa1222 during testing.)
>>>>>>
>>>>>> Bad (crash reproduced):
>>>>>> 428afac5a8ea9c55bb8408e02dc92b8f85bf5f30
>>>>>> KVM: x86: Move bulk of emergency virtualization logic to virt subsystem
>>>>>
>>>>> Any chance you have the enable_virt_at_load module option set to false?
>>>>
>>>> No, it is set to Y.
>>>> # cat /sys/module/kvm/parameters/enable_virt_at_load
>>>> Y
>>>>
>>>>
>>>>>
>>>>>>
>>>>>> So the first bad commit in my host kernel bisect was 428afac5a8ea. The
>>>>>> commit prior [32d76cdfa122] did not have this issue.
>>>>>>
>>>>>> Later, with QEMU master [2] and the same linux-next next-20260319 as the
>>>>>> host kernel, the shutdown issue did not reproduce.
>>>>>>
>>>>>> QEMU master contains 56d89db2cfd82c53439778fbf39294bf35194dba
>>>>>> (target/i386: convert SEV-ES termination requests to guest panic
>>>>>> events).
>>>>>> Cherry-picking that commit onto QEMU v10.2.2 resolved or at least
>>>>>> avoided the shutdown crash in my setup.
>>>>>
>>>>> Well, if it is converting a guest termination request, that is still not
>>>>> good. It should be a clean shutdown.
>>>>>
>>>>>>
>>>>>>
>>>>>> Questions:
>>>>>>
>>>>>> KVM: Is the interaction/issue with older QEMU (e.g. v10.2.2) expected
>>>>>> here, or is there anything that should be adjusted or documented
>>>>>> following 428afac5a8ea, e.g. for multi-vCPU SEV-ES guests?
>>>>>> QEMU: Would a stable backport of 56d89db2cfd8 to 10.2.x (or equivalent
>>>>>> handling of SEV-ES termination) be appropriate for users staying on
>>>>>> stable QEMU while moving to newer host kernels?
>>>>>
>>>>> That wouldn't actually solve the issue, it is just a much more user
>>>>> friendly error message. Is there a termination event in the host dmesg
>>>>> log?
>>>>
>>>>
>>>> The guest shutdown proceeds normally until:
>>>> [ OK ] Reached target poweroff.target - System Power Off.
>>>> [ 9.918849] reboot: Power down
>>>>
>>>> At that point the serial console freezes with the register dump below
>>>> (EIP=0000b004, HLT=1, EFER=0, etc.).
>>>>
>>>> [  OK  ] Finished systemd-poweroff.service - System Power Off.
>>>> [  OK  ] Reached target poweroff.target - System Power Off.
>>>> [   10.029330] reboot: Power down
>>>> EAX=00000000 EBX=00000000 ECX=00000000 EDX=00a00f11
>>>> ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
>>>> EIP=0000b004 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1
>>>> ES =0000 00000000 0000ffff 00009300
>>>> CS =f000 00800000 0000ffff 00009b00
>>>> SS =0000 00000000 0000ffff 00009300
>>>> DS =0000 00000000 0000ffff 00009300
>>>> FS =0000 00000000 0000ffff 00009300
>>>> GS =0000 00000000 0000ffff 00009300
>>>> LDT=0000 00000000 0000ffff 00008200
>>>> TR =0000 00000000 0000ffff 00008b00
>>>> GDT=     00000000 0000ffff
>>>> IDT=     00000000 0000ffff
>>>> CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
>>>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>>>> DR3=0000000000000000
>>>> DR6=00000000ffff0ff0 DR7=0000000000000400
>>>> EFER=0000000000000000
>>>> Code=98 59 0c db 72 6c 94 71 3d a6 36 32 49 a8 08 22 bd d7 8c bb <4c> 3c
>>>> d9 bd 90 b5 2e a0 69 26 53 df aa 4c bb fe 5a d9 b6 ee 7b 45 02 2e cf d9
>>>> 60 48 00 00
>>>>
>>>>
>>>>
>>>> QEMU does **not** exit on its own — it appears stuck.
>>>>
>>>> Only after I press Ctrl+C do I see in host dmesg:
>>>> kvm_amd: SEV-ES guest requested termination: 0x0:0x0
>>>
>>> So we would have to see what is triggering that termination request.
>
> IIUC, the termination request only occurs after CTRL+C. If that's correct, it's
> a red herring, and the real question is why a graceful shutdown hangs.
>
>>> We can probably instrument a guest kernel to get some more info.
>>
>> Sure, I can apply any debug patch and provide the debug logs.
>>
>> Note: I'm heading out on PTO until next Wednesday (April 8th). I won't be
>> able to gather additional debug logs until I return.
>>
>>>
>>>>
>>>> I also set `kvm_amd.dump_invalid_vmcb=1` before reproducing, but it
>>>> produced no additional output.
>>>>
>>>> This issue is still present on latest linux-next next-20260331
>>>> [https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tag/?h=next-20260331,
>>>> tag name next-20260331 (e5da3eef8dadab4e98b228725ca8948edd9d601f)]
>>>
>>> Is it only with linux-next? Which would point to a kernel change vs a
>>> Qemu change.
>>
>> Yes, it is specific to linux-next (starting from next-20260304, bisected to
>> 428afac5a8ea "KVM: x86: Move bulk of emergency virtualization logic to virt
>> subsystem").
>>
>> - With older host kernels (< next-20260304) + QEMU 10.2.2 → clean shutdown
>> (no hang, no termination message, QEMU exits normally).
>> - With linux-next (next-20260331) + QEMU 10.2.2 → hang at the register dump
>> after "reboot: Power down"; only Ctrl+C triggers the "SEV-ES guest requested
>> termination: 0x0:0x0" message.
>> - With linux-next + QEMU master (or 10.2.2 + cherry-pick of 56d89db2cfd8) →
>> no hang (the termination is converted to a guest panic instead).
>
> What guest kernel are you using? Bisecting to that commit for just the *host*
> kernel is baffling. I could see it preventing KVM from loading or something, but
> it should be completely out of scope with respect to guest activity.
>
> How are you initiating shutdown within the guest? What's the full QEMU command
> line?
>
> Can you also provide the OVMF image? E.g. in case the hang occurs in EFI runtime
> services or something.
>
> I want to get this sorted out before the merge window and so would prefer not to
> delay root causing this by a week or more.

I just tried linux-next/next-20260401 with QEMU 10.2.2 and the same
linux-next kernel in the guest and was unable to reproduce the issue. So a
re-test with the latest linux-next would be good, to see whether the issue
remains. If it does, then yes, more information is needed (including the
kernel config used for the build).
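
For gathering the host-side details in one go, something along these
lines might help. This is only a rough sketch, not a vetted tool: the
helper names (read_text, collect_host_info) are made up for
illustration, and it assumes the build config is installed as
/boot/config-<release>, which may not hold for a locally built kernel.
The sysfs path is the one quoted earlier in this thread.

```python
#!/usr/bin/env python3
"""Collect basic host details for a KVM/SEV-ES bug report (illustrative sketch)."""
import platform
from pathlib import Path


def read_text(path):
    """Return the stripped contents of path, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except (OSError, UnicodeDecodeError):
        return None


def collect_host_info(release=None):
    """Gather kernel release, command line, KVM parameter and config location."""
    release = release or platform.release()
    # Assumption: the build config lives at /boot/config-<release>.
    config = Path("/boot") / f"config-{release}"
    return {
        "kernel_release": release,
        "cmdline": read_text("/proc/cmdline"),
        "enable_virt_at_load": read_text(
            "/sys/module/kvm/parameters/enable_virt_at_load"
        ),
        "kernel_config": str(config) if config.is_file() else None,
    }


if __name__ == "__main__":
    for key, value in collect_host_info().items():
        print(f"{key}: {value if value is not None else '<unavailable>'}")
```

Anything that cannot be read is reported as unavailable rather than
raising, so the script can be run as-is on a host where kvm is not
loaded. The QEMU command line and OVMF image still need to be supplied
separately.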

Thanks,
Tom