Re: [PATCH] PCI/MSI: Avoid torn updates to MSI pairs

From: Evan Green
Date: Wed Jan 22 2020 - 13:27:18 EST


On Wed, Jan 22, 2020 at 9:28 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> [+cc Thomas, Marc, Christoph, Rajat]
>
> On Thu, Jan 16, 2020 at 01:31:28PM -0800, Evan Green wrote:
> > __pci_write_msi_msg() updates three registers in the device: address
> > high, address low, and data. On x86 systems, address low contains
> > CPU targeting info, and data contains the vector. The order of writes
> > is address, then data.
> >
> > This is problematic if an interrupt comes in after address has
> > been written, but before data is updated, and the SMP affinity of
> > the interrupt is changing. In this case, the interrupt targets the
> > wrong vector on the new CPU.
> >
> > This case is pretty easy to stumble into using xhci and CPU hotplugging.
> > Create a script that targets interrupts at a set of cores and then
> > offlines those cores. Put some stress on USB, and then watch xhci lose
> > an interrupt and die.
> >
> > Avoid this by disabling MSIs during the update.
> >
> > Signed-off-by: Evan Green <evgreen@xxxxxxxxxxxx>
> > ---
> >
> >
> > Bjorn,
> > I was unsure whether disabling MSIs temporarily is actually an okay
> > thing to do. I considered using the mask bit, but got the impression
> > that not all devices support the mask bit. Let me know if this going to
> > cause problems or there's a better way. I can include the repro
> > script I used to cause mayhem if needed.
>
> I suspect this *is* a problem because I think disabling MSI doesn't
> disable interrupts; it just means the device will interrupt using INTx
> instead of MSI. And the driver is probably not prepared to handle
> INTx.
>
> PCIe r5.0, sec 7.7.1.2, seems relevant: "If MSI and MSI-X are both
> disabled, the Function requests servicing using INTx interrupts (if
> supported)."
>
> Maybe the IRQ guys have ideas about how to solve this?
>

But don't we already do this in __pci_restore_msi_state():
pci_intx_for_msi(dev, 0);
pci_msi_set_enable(dev, 0);
arch_restore_msi_irqs(dev);

I'd think if there were a chance for a line-based interrupt to get in
and wedge itself, it would already be happening there.

One other way you could avoid torn MSI writes would be to ensure that
if you migrate IRQs across cores, you keep the same x86 vector number.
That way the address portion would be updated, and data doesn't
change, so there's no window. But that may not actually be feasible.

-Evan