Re: [PATCH 7/9] virtio-pci: harden INTX interrupts

From: Peter Zijlstra
Date: Tue Sep 14 2021 - 07:04:45 EST


On Mon, Sep 13, 2021 at 11:36:24PM +0200, Thomas Gleixner wrote:

> That's the real problem and for that your barrier is at the wrong place
> because you want to make sure that those stores are visible before the
> store to intx_soft_enabled becomes visible, i.e. this should be:
>
>
> /* Ensure that all preceeding stores are visible before intx_soft_enabled */
> smp_wmb();
> vp_dev->intx_soft_enabled = true;

That arguably wants to be smp_store_release() instead of smp_wmb() :-)

> Now Micheal is not really enthusiatic about the barrier in the interrupt
> handler hotpath, which is understandable.
>
> As the device startup is not really happening often it's sensible to do
> the following
>
> disable_irq();
> vp_dev->intx_soft_enabled = true;
> enable_irq();
>
> because:
>
> disable_irq()
> synchronize_irq()
>
> acts as a barrier for the preceeding stores:
>
> disable_irq()
> raw_spin_lock(desc->lock);
> __disable_irq(desc);
> raw_spin_unlock(desc->lock);
>
> synchronize_irq()
> do {
> raw_spin_lock(desc->lock);
> in_progress = check_inprogress(desc);
> raw_spin_unlock(desc->lock);
> } while (in_progress);

Here you rely on the UNLOCK+LOCK pattern because we have two adjacent
critical sections (or rather, the same twice), which provides RCtso
ordering, which is sufficient to make the below store:

>
> intx_soft_enabled = true;

a RELEASE. still, I would suggest writing it at least using
WRITE_ONCE() with a comment on.

disable_irq();
/*
* The above disable_irq() provides TSO ordering and as such
* promotes the below store to store-release.
*/
WRITE_ONCE(intx_soft_enabled, true);
enable_irq();

> In this case synchronize_irq() prevents the subsequent store to
> intx_soft_enabled to leak into the __disable_irq(desc) section which in
> turn makes it impossible for an interrupt handler to observe
> intx_soft_enabled == true before the prerequisites which preceed the
> call to disable_irq() are visible.
>
> Of course the memory ordering wizards might disagree, but if they do,
> then we have a massive chase of ordering problems vs. similar constructs
> all over the tree ahead of us.

Your case, UNLOCK s + LOCK s, is fully documented to provide RCtso
ordering. The more general case of: UNLOCK r + LOCK s, will shortly
appear in documentation near you. Meaning we can forget about the
details an blanket state that any UNLOCK followed by a LOCK (on the same
CPU) will provide TSO ordering.