On Tue, Aug 24, 2021, Borislav Petkov wrote:
On Wed, Aug 04, 2021 at 11:13:25AM -0700, Kuppuswamy Sathyanarayanan wrote:
+static __cpuidle void tdg_safe_halt(void)
+{
+ u64 ret;
+
+ /*
+ * Enable interrupts next to the TDVMCALL to avoid
+ * performance degradation.
That comment needs some more love to say exactly what the problem is.
LOL, I guess hanging the vCPU counts as degraded performance. But this comment
can and should go away entirely...
+ */
+ local_irq_enable();
...because this is broken. It's also disturbing because it suggests that these
patches are not being tested.
The STI _must_ immediately precede TDCALL, and it _must_ execute with interrupts
disabled. The whole point of the STI blocking shadow is to ensure interrupts are
blocked until _after_ the HLT completes so that a wake event is not recognized
before the HLT, in which case the vCPU will get stuck in HLT because its wake
event already fired. Enabling IRQs well before the TDCALL defeats the purpose of
the STI dance in __tdx_hypercall().
There's even a massive comment in __tdx_hypercall() explaining all this...
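To make the race concrete, here is a minimal userspace sketch of the two orderings. It is a toy model, not kernel code: struct vcpu_model and the model_*() helpers are hypothetical names invented for illustration, and the STI shadow is approximated simply by not allowing delivery between "STI" and the halt.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one vCPU. All names here are hypothetical. */
struct vcpu_model {
	bool irqs_enabled;	/* stands in for RFLAGS.IF */
	bool wake_pending;	/* a wake interrupt is waiting for delivery */
	bool halted;		/* vCPU is blocked in HLT */
};

/*
 * Deliver the pending wake event if interrupts are enabled.
 * Delivery consumes the event and wakes the vCPU if it is halted.
 */
void model_deliver_irq(struct vcpu_model *v)
{
	if (v->irqs_enabled && v->wake_pending) {
		v->wake_pending = false;
		v->halted = false;
	}
}

/*
 * Broken ordering: IRQs are enabled well before the halt, so the
 * wake event can be recognized in the window before HLT.
 * Returns true if the vCPU ends up stuck in HLT.
 */
bool model_halt_early_enable(struct vcpu_model *v)
{
	v->irqs_enabled = true;
	model_deliver_irq(v);	/* window: event consumed before HLT */
	v->halted = true;	/* HLT with nothing left to wake it */
	model_deliver_irq(v);
	return v->halted;
}

/*
 * Intended ordering: entered with IRQs disabled, "STI" immediately
 * precedes the halt, and the shadow blocks delivery until the halt
 * has begun. Returns true if the vCPU ends up stuck in HLT.
 */
bool model_halt_sti_shadow(struct vcpu_model *v)
{
	assert(!v->irqs_enabled);	/* precondition from the review */
	/* STI executes here; the shadow means no delivery point yet. */
	v->halted = true;		/* HLT begins inside the shadow */
	v->irqs_enabled = true;		/* shadow ends, IRQs now open */
	model_deliver_irq(v);		/* wake event ends the HLT */
	return v->halted;
}
```

Starting both helpers from the same state (IRQs disabled, one wake event pending), the early-enable variant leaves the model halted with no event left to wake it, while the shadow variant wakes up. Real hardware and the TDX module are far more involved; the model only captures the ordering argument.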
+
+ /* IRQ is enabled, So set R12 as 0 */
It would be helpful to use local variables to document what's up, e.g.
const bool irqs_enabled = true;
const bool do_sti = true;
ret = _tdx_hypercall(EXIT_REASON_HLT, irqs_enabled, 0, 0, do_sti, NULL);
+ ret = _tdx_hypercall(EXIT_REASON_HLT, 0, 0, 0, 1, NULL);
+
+ /* It should never fail */
+ BUG_ON(ret);
+}
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette