Re: [PATCHv7 21/30] x86/acpi, x86/boot: Add multiprocessor wake-up support

From: Kirill A. Shutemov
Date: Thu Mar 24 2022 - 11:24:07 EST


On Fri, Mar 18, 2022 at 11:23:59AM -0700, Dave Hansen wrote:
> On 3/18/22 08:30, Kirill A. Shutemov wrote:
> > + /*
> > + * After writing the wakeup command, wait for maximum timeout of 0xFF
> > + * for firmware to reset the command address back zero to indicate
> > + * the successful reception of command.
> > + * NOTE: 0xFF as timeout value is decided based on our experiments.
> > + *
> > + * XXX: Change the timeout once ACPI specification comes up with
> > + * standard maximum timeout value.
> > + */
> > + timeout = 0xFF;
> > + while (READ_ONCE(acpi_mp_wake_mailbox->command) && --timeout)
> > + cpu_relax();
>
> I don't feel like this was ever actually resolved. This timeout
> basically boiled down to "this value seems to work for us". There are
> also *SURELY* timeouts that are going to happen here.

It makes me think if it can be an attack vector: once kernel initiated
wake up of a secondary vCPU it has no control on how long it takes.

I worry that malicious VMM can induce timeout intentionally, but then wake
up the secondary CPU when kernel doesn't expect it. After quick look I was
not able to convince myself that kernel can deal with this without a
problem.

Is it legitimate concern?

Patch below drops timeout handling completely. Any opinions?

Other option would be to check in the trampoline code that initiated wake
up is legitimate. But it should only be untrue if VMM acting weird (or
virtual BIOS is buggy). I don't think it's right side to deal with the
problem.

> > /*
> > * If the CPU wakeup process is successful, store the
> > * status in apic_id_wakemap to prevent re-wakeup
> > * requests.
> > */
> > physid_set(apicid, apic_id_wakemap);
> >
> > return 0;
> > }
>
> If this goes wrong, won't the new wakeup just timeout? Do we really
> need a dedicated mechanism to stop re-wakeups? How much of a problem is
> this going to be?

Well, it can provide a proper diagnostics and a distinct error code. If
you think it is unneeded we can drop it.