Re: [PATCH V2 06/12] virtio_pci: harden MSI-X interrupts

From: Jason Wang
Date: Tue Oct 19 2021 - 21:34:06 EST


On Wed, Oct 20, 2021 at 1:02 AM Dongli Zhang <dongli.zhang@xxxxxxxxxx> wrote:
>
>
>
> On 10/18/21 6:33 PM, Jason Wang wrote:
> > On Sat, Oct 16, 2021 at 1:27 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> >>
> >> On Fri, Oct 15, 2021 at 05:09:38AM -0700, Dongli Zhang wrote:
> >>> Hi Jason,
> >>>
> >>> On 10/11/21 11:52 PM, Jason Wang wrote:
> >>>> We used to synchronize pending MSI-X irq handlers via
> >>>> synchronize_irq(), this may not work for the untrusted device which
> >>>> may keep sending interrupts after reset which may lead unexpected
> >>>> results. Similarly, we should not enable MSI-X interrupt until the
> >>>
> >>> About "unexpected results", while you mentioned below in v1 ...
> >>>
> >>> "My understanding is that e.g in the case of SEV/TDX we don't trust the
> >>> hypervisor. So the hypervisor can keep sending interrupts even if the
> >>> device is reset. The guest can only trust its own software interrupt
> >>> management logic to avoid call virtio callback in this case."
> >>>
> >>> .. and you also mentioned to avoid the panic (due to untrusted device) in as
> >>> many scenarios as possible.
> >>>
> >>>
> >>> Here is my understanding.
> >>>
> >>> The reason we do not trust hypervisor is to protect (1) data/code privacy for
> >>> most of times and sometimes (2) program execution integrity.
> >>>
> >>> The bad thing is: the hypervisor is able to panic/destroy the VM in the worst case.
> >>>
> >>> It is reasonable to re-configure/recover if we assume there is BUG at
> >>> hypervisor/device side. That is, the hypervisor/device is buggy, but not malicious.
> >>>
> >>> However, how about to just detect and report the hypervisor/device is malicious
> >>> and shutdown/panic the VM immediately? If we have detected the malicious
> >>> hypervisor, we should avoid running VMs on the malicious hypervisor further. At
> >>> least how about to print error message to reminder users that the hypervisor is
> >>> malicious?
> >
> > I understand your point, but several questions needs to be answered:
> >
> > 1) how can we easily differentiate "malicious" from "buggy"?
> > 2) If we want to detect malicious hypervisor, virtio is not the only
> > place that we want to do this
> > 3) Is there a way that "malicious" hypervisor can prevent guest from
> > shutting down itself?
> >
> >>>
> >>>
> >>> About "unexpected results", it should not be hang/panic. I suggest ...
> >>>
> >
> > It's just the phenomenon not the logic behind that. It could be e.g
> > OOB which may lead the unexpected kernel codes to be executed in
> > unexpected ways (e.g mark the page as shared or go with IOTLB etc).
> > Sometimes we can see panic finally but it's not always.
>
> This is what I was trying to explain.
>
> The objective is to protect "data privacy" (or code execution integrity in some
> case), but not to prevent DoS attack. That is, the 'malicious' hypervisor should
> not be able to derive VM data privacy.
>
> Suppose the hypervisor did something buggy/malicious when SEV/TDX is used by VM,
> the "unexpected results" should never reveal secure/private data to the hypervisor.
>
> In my own opinion, the threat model is:
>
> Attacker: 'malicious' hypervisor
>
> Victim: VM with SEV/TDX/SGX
>
> The attacker should not be able to steal secure/private data from VM, when the
> hypervisor's action is unexpected. DoS is out of the scope.
>
> My concern is: it is very hard to clearly explain in the patchset how the
> hypervisor is able to steal VM's data, by setting queue=0 or injecting unwanted
> interrupts to VM.

Yes, it's a hard question but instead of trying to answer that, we can
just fix the case of e.g unexpected interrupts.

Thanks

>
> Thank you very much!
>
> Dongli Zhang
>
> >
> >>> Assuming SEV/TDX is involved, the hypervisor should never be able to derive the
> >>> VM privacy (at least secure memory data) by providing malicious configuration,
> >>> e.g., num_queues=0.
> >
> > Yes.
> >
> >>> If we detect hypervisor is malicious, the VM is
> >>> panic/shutdown immediately.
> >
> > This seems to enforce the policy into the kernel, we need to leave
> > this to userspace to decide.
> >
> >>> At least how about to print error message to
> >>> reminder users.
> >
> > This is fine.
> >
> >>>
> >>>
> >>> BTW, while I always do care about the loss of interrupt issue, the malicious
> >>> device is able to hang a VM by dropping a single interrupt on purpose for
> >>> virtio-scsi :)
> >>>
> >
> > Right.
> >
> >>>
> >>> Thank you very much!
> >>
> >>
> >> Can't say I understand what it's about. TDX does not protect against
> >> hypervisor DoS attacks.
> >
> > Yes, I think what Dongli wants to discuss is how to behave if we
> > detect a malicious hypervisor. He suggested a shutdown instead of
> > failing the probe.
> >
> > Thanks
> >
> >>
> >>> Dongli Zhang
> >>>
> >>>> device is ready. So this patch fixes those two issues by:
> >>>>
> >>>> 1) switching to use disable_irq() to prevent the virtio interrupt
> >>>> handlers to be called after the device is reset.
> >>>> 2) using IRQF_NO_AUTOEN and enable the MSI-X irq during .ready()
> >>>>
> >>>> This can make sure the virtio interrupt handler won't be called before
> >>>> virtio_device_ready() and after reset.
> >>>>
> >>>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> >>>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> >>>> Cc: Paul E. McKenney <paulmck@xxxxxxxxxx>
> >>>> Signed-off-by: Jason Wang <jasowang@xxxxxxxxxx>
> >>>> ---
> >>>> drivers/virtio/virtio_pci_common.c | 27 +++++++++++++++++++++------
> >>>> drivers/virtio/virtio_pci_common.h | 6 ++++--
> >>>> drivers/virtio/virtio_pci_legacy.c | 5 +++--
> >>>> drivers/virtio/virtio_pci_modern.c | 6 ++++--
> >>>> 4 files changed, 32 insertions(+), 12 deletions(-)
> >>>>
> >>>> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
> >>>> index b35bb2d57f62..0b9523e6dd39 100644
> >>>> --- a/drivers/virtio/virtio_pci_common.c
> >>>> +++ b/drivers/virtio/virtio_pci_common.c
> >>>> @@ -24,8 +24,8 @@ MODULE_PARM_DESC(force_legacy,
> >>>> "Force legacy mode for transitional virtio 1 devices");
> >>>> #endif
> >>>>
> >>>> -/* wait for pending irq handlers */
> >>>> -void vp_synchronize_vectors(struct virtio_device *vdev)
> >>>> +/* disable irq handlers */
> >>>> +void vp_disable_vectors(struct virtio_device *vdev)
> >>>> {
> >>>> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> >>>> int i;
> >>>> @@ -34,7 +34,20 @@ void vp_synchronize_vectors(struct virtio_device *vdev)
> >>>> synchronize_irq(vp_dev->pci_dev->irq);
> >>>>
> >>>> for (i = 0; i < vp_dev->msix_vectors; ++i)
> >>>> - synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i));
> >>>> + disable_irq(pci_irq_vector(vp_dev->pci_dev, i));
> >>>> +}
> >>>> +
> >>>> +/* enable irq handlers */
> >>>> +void vp_enable_vectors(struct virtio_device *vdev)
> >>>> +{
> >>>> + struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> >>>> + int i;
> >>>> +
> >>>> + if (vp_dev->intx_enabled)
> >>>> + return;
> >>>> +
> >>>> + for (i = 0; i < vp_dev->msix_vectors; ++i)
> >>>> + enable_irq(pci_irq_vector(vp_dev->pci_dev, i));
> >>>> }
> >>>>
> >>>> /* the notify function used when creating a virt queue */
> >>>> @@ -141,7 +154,8 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
> >>>> snprintf(vp_dev->msix_names[v], sizeof *vp_dev->msix_names,
> >>>> "%s-config", name);
> >>>> err = request_irq(pci_irq_vector(vp_dev->pci_dev, v),
> >>>> - vp_config_changed, 0, vp_dev->msix_names[v],
> >>>> + vp_config_changed, IRQF_NO_AUTOEN,
> >>>> + vp_dev->msix_names[v],
> >>>> vp_dev);
> >>>> if (err)
> >>>> goto error;
> >>>> @@ -160,7 +174,8 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
> >>>> snprintf(vp_dev->msix_names[v], sizeof *vp_dev->msix_names,
> >>>> "%s-virtqueues", name);
> >>>> err = request_irq(pci_irq_vector(vp_dev->pci_dev, v),
> >>>> - vp_vring_interrupt, 0, vp_dev->msix_names[v],
> >>>> + vp_vring_interrupt, IRQF_NO_AUTOEN,
> >>>> + vp_dev->msix_names[v],
> >>>> vp_dev);
> >>>> if (err)
> >>>> goto error;
> >>>> @@ -337,7 +352,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
> >>>> "%s-%s",
> >>>> dev_name(&vp_dev->vdev.dev), names[i]);
> >>>> err = request_irq(pci_irq_vector(vp_dev->pci_dev, msix_vec),
> >>>> - vring_interrupt, 0,
> >>>> + vring_interrupt, IRQF_NO_AUTOEN,
> >>>> vp_dev->msix_names[msix_vec],
> >>>> vqs[i]);
> >>>> if (err)
> >>>> diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
> >>>> index beec047a8f8d..a235ce9ff6a5 100644
> >>>> --- a/drivers/virtio/virtio_pci_common.h
> >>>> +++ b/drivers/virtio/virtio_pci_common.h
> >>>> @@ -102,8 +102,10 @@ static struct virtio_pci_device *to_vp_device(struct virtio_device *vdev)
> >>>> return container_of(vdev, struct virtio_pci_device, vdev);
> >>>> }
> >>>>
> >>>> -/* wait for pending irq handlers */
> >>>> -void vp_synchronize_vectors(struct virtio_device *vdev);
> >>>> +/* disable irq handlers */
> >>>> +void vp_disable_vectors(struct virtio_device *vdev);
> >>>> +/* enable irq handlers */
> >>>> +void vp_enable_vectors(struct virtio_device *vdev);
> >>>> /* the notify function used when creating a virt queue */
> >>>> bool vp_notify(struct virtqueue *vq);
> >>>> /* the config->del_vqs() implementation */
> >>>> diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
> >>>> index d62e9835aeec..bdf6bc667ab5 100644
> >>>> --- a/drivers/virtio/virtio_pci_legacy.c
> >>>> +++ b/drivers/virtio/virtio_pci_legacy.c
> >>>> @@ -97,8 +97,8 @@ static void vp_reset(struct virtio_device *vdev)
> >>>> /* Flush out the status write, and flush in device writes,
> >>>> * including MSi-X interrupts, if any. */
> >>>> ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
> >>>> - /* Flush pending VQ/configuration callbacks. */
> >>>> - vp_synchronize_vectors(vdev);
> >>>> + /* Disable VQ/configuration callbacks. */
> >>>> + vp_disable_vectors(vdev);
> >>>> }
> >>>>
> >>>> static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector)
> >>>> @@ -194,6 +194,7 @@ static void del_vq(struct virtio_pci_vq_info *info)
> >>>> }
> >>>>
> >>>> static const struct virtio_config_ops virtio_pci_config_ops = {
> >>>> + .ready = vp_enable_vectors,
> >>>> .get = vp_get,
> >>>> .set = vp_set,
> >>>> .get_status = vp_get_status,
> >>>> diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
> >>>> index 30654d3a0b41..acf0f6b6381d 100644
> >>>> --- a/drivers/virtio/virtio_pci_modern.c
> >>>> +++ b/drivers/virtio/virtio_pci_modern.c
> >>>> @@ -172,8 +172,8 @@ static void vp_reset(struct virtio_device *vdev)
> >>>> */
> >>>> while (vp_modern_get_status(mdev))
> >>>> msleep(1);
> >>>> - /* Flush pending VQ/configuration callbacks. */
> >>>> - vp_synchronize_vectors(vdev);
> >>>> + /* Disable VQ/configuration callbacks. */
> >>>> + vp_disable_vectors(vdev);
> >>>> }
> >>>>
> >>>> static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector)
> >>>> @@ -380,6 +380,7 @@ static bool vp_get_shm_region(struct virtio_device *vdev,
> >>>> }
> >>>>
> >>>> static const struct virtio_config_ops virtio_pci_config_nodev_ops = {
> >>>> + .ready = vp_enable_vectors,
> >>>> .get = NULL,
> >>>> .set = NULL,
> >>>> .generation = vp_generation,
> >>>> @@ -397,6 +398,7 @@ static const struct virtio_config_ops virtio_pci_config_nodev_ops = {
> >>>> };
> >>>>
> >>>> static const struct virtio_config_ops virtio_pci_config_ops = {
> >>>> + .ready = vp_enable_vectors,
> >>>> .get = vp_get,
> >>>> .set = vp_set,
> >>>> .generation = vp_generation,
> >>>>
> >>
> >
>