RE: [PATCH 4/7] x86/kexec: Disable kexec/kdump on platforms with TDX partial write erratum
From: Reshetova, Elena
Date: Thu Oct 02 2025 - 04:10:44 EST
> On 02.10.25 08:59, Reshetova, Elena wrote:
> >> On Wed, Oct 1, 2025 at 7:32 AM Dave Hansen <dave.hansen@xxxxxxxxx>
> >> wrote:
> >>>
> >>> On 9/30/25 19:05, Vishal Annapurve wrote:
> >>> ...
> >>>>> Any workarounds are going to be slow and probably imperfect. That's not
> >>>>
> >>>> Do we really need to deploy workarounds that are complex and slow to
> >>>> get kdump working for the majority of the scenarios? Is there any
> >>>> analysis done for the risk with imperfect and simpler workarounds vs
> >>>> benefits of kdump functionality?
> >>>>
> >>>>> a great match for kdump. I'm perfectly happy waiting for fixed hardware
> >>>>> from what I've seen.
> >>>>
> >>>> IIUC SPR/EMR - two CPU generations out there are impacted by this
> >>>> erratum and just disabling kdump functionality IMO is not the best
> >>>> solution here.
> >>>
> >>> That's an eminently reasonable position. But we're speaking in broad
> >>> generalities and I'm unsure what you don't like about the status quo or
> >>> how you'd like to see things change.
> >>
> >> Looks like the decision to disable kdump was taken between [1] -> [2].
> >> "The kernel currently doesn't track which page is TDX private memory.
> >> It's not trivial to reset TDX private memory. For simplicity, this
> >> series simply disables kexec/kdump for such platforms. This will be
> >> enhanced in the future."
> >>
> >> A patch [3] from the series[1], describes the issue as:
> >> "This problem is triggered by "partial" writes where a write transaction
> >> of less than cacheline lands at the memory controller. The CPU does
> >> these via non-temporal write instructions (like MOVNTI), or through
> >> UC/WC memory mappings. The issue can also be triggered away from the
> >> CPU by devices doing partial writes via DMA."
> >>
> >> And also mentions:
> >> "Also note only the normal kexec needs to worry about this problem, but
> >> not the crash kexec: 1) The kdump kernel only uses the special memory
> >> reserved by the first kernel, and the reserved memory can never be used
> >> by TDX in the first kernel; 2) The /proc/vmcore, which reflects the
> >> first (crashed) kernel's memory, is only for read. The read will never
> >> "poison" TDX memory thus cause unexpected machine check (only partial
> >> write does)."
> >
> > While the statement that the read will never poison the memory is correct,
> > the situation we can theoretically worry about is, in my understanding,
> > the following:
> >
> > 1. During its execution on a platform with the partial write problem, the
> > host OS or another actor executing outside of SEAM mode triggers a partial
> > write into a cache line that originally belonged to TDX private memory.
> > This is something that the host OS or other entities should not do, but it
> > could happen due to host OS bugs, etc.
> > 2. The above causes the specified cache line to be poisoned by the memory
> > controller. However, here we assume that no one accesses this cache line
> > from the TDX module, TD guests or the host OS for the time being, and the
> > problem remains hidden.
> > 3. The host OS crashes due to some other issue, the kdump crash kernel is
> > triggered, and kdump starts to read all the memory from the previous host
> > kernel to dump the diagnostics info.
> > 4. At some point, the kdump crash kernel reaches the memory with the
> > poisoned cache line, consumes the poison, and the #MC is issued for the
> > kernel space.
> >
> > Isn't this the reason for also disabling kdump? Or am I missing something?
>
> So let's compare the two cases, kdump enabled and kdump disabled, in your
> scenario (crash of the host OS):
>
> kdump enabled: No dump can be produced due to the #MC and system is rebooted.
>
> kdump disabled: No dump is produced and system is rebooted after crash.
>
> What is the main concern with kdump enabled? I don't see any disadvantage
> with enabling it, just the advantage that in many cases a dump will be
> written.

I am not in a position to judge what should be done about kdump in Linux,
nor am I arguing one way or the other.
I just wanted to fill the gap and explain the technical scenario above,
which I think was missing from this thread. Whatever decision the community
takes should rely on an understanding of the HW behaviour, and that is what
I tried to explain above.
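
To make the four steps of the scenario concrete, here is a small toy model.
It is only a sketch: it is not kernel code, and every name in it (ToyMemory,
MachineCheck, dump_all) is made up for illustration. It assumes, as in the
description quoted in this thread, that only a partial (less than a
cacheline) write to a line that belonged to TDX private memory poisons it,
and that the poison stays hidden until the line is read.

```python
# Toy model of the scenario; all names are hypothetical, not kernel or HW APIs.
CACHELINE = 64  # bytes per cache line


class MachineCheck(Exception):
    """Stands in for the #MC raised when poison is consumed."""


class ToyMemory:
    def __init__(self, n_lines, tdx_private):
        self.n_lines = n_lines
        # Cache lines that belonged to TDX private memory.
        self.tdx_private = set(tdx_private)
        self.poisoned = set()

    def write(self, line, nbytes):
        # Steps 1-2: a partial write from outside SEAM mode into a TDX
        # private line poisons it. Nothing is reported at this point.
        if line in self.tdx_private and nbytes < CACHELINE:
            self.poisoned.add(line)

    def read(self, line):
        # Step 4: the poison is only noticed when the line is consumed.
        if line in self.poisoned:
            raise MachineCheck(f"poison consumed at cache line {line}")
        return bytes(CACHELINE)


def dump_all(mem):
    """Step 3: a kdump-like pass that reads every line of the old kernel.

    Returns (lines_read, completed).
    """
    lines_read = 0
    try:
        for line in range(mem.n_lines):
            mem.read(line)
            lines_read += 1
    except MachineCheck:
        return lines_read, False
    return lines_read, True


if __name__ == "__main__":
    mem = ToyMemory(n_lines=8, tdx_private={5})
    mem.write(5, 8)        # buggy partial write, stays hidden
    print(dump_all(mem))   # the dump pass stops when it reaches line 5
```

In this model a dump completes whenever no poisoned line exists, which is
the "in many cases a dump will be written" case mentioned above, and it is
cut short only when the reading pass actually reaches a poisoned line.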
Best Regards,
Elena.