Re: [PATCH v1 0/2] Add ability of VFIO driver to ignore reset when device doesn't need it

From: Zhenguo Yao
Date: Thu Oct 14 2021 - 09:38:00 EST


OK, thank you. Let's wait for NVIDIA's solution.

On Thu, Oct 14, 2021 at 8:48 PM, Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:
>
> On Thu, 14 Oct 2021 17:57:46 +0800
> Zhenguo Yao <yaozhenguo1@xxxxxxxxx> wrote:
>
> > In some scenarios, a vfio device must not be reset during its
> > initialization. One example is NVSwitch and A100 GPUs working in the
> > Shared NVSwitch Virtualization Model. In this mode there are two types
> > of VM: the service VM and the guest VM. The GPU devices are initialized
> > in the following steps:
> >
> > 1. The service VM boots up. GPUs and NVSwitches are passed through to
> > the service VM. The NVIDIA driver and manager software apply some
> > settings in the service VM.
> >
> > 2. The selected GPUs are unplugged from the service VM.
> >
> > 3. The guest VM boots up with the selected GPUs passed through.
> >
> > The selected GPUs must not be reset in step 3, or they will fail to
> > initialize in the guest VM.
> >
> > This patchset adds a PCI sysfs interface, ignore_reset, which drivers
> > can use to control whether to do a PCI reset. For example, in the Shared
> > NVSwitch Virtualization Model, the hypervisor can disable PCI reset by
> > setting ignore_reset to 1 before the guest VM boots up.
> >
> > Zhenguo Yao (2):
> > PCI: Add ignore_reset sysfs interface to control whether do device
> > reset in PCI drivers
> > vfio-pci: Don't do device reset when ignore_reset is setting
> >
> > drivers/pci/pci-sysfs.c | 25 +++++++++++++++++
> > drivers/vfio/pci/vfio_pci_core.c | 48 ++++++++++++++++++++------------
> > include/linux/pci.h | 1 +
> > 3 files changed, 56 insertions(+), 18 deletions(-)
> >
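For illustration only, here is a minimal userspace sketch of how a hypervisor
might set the ignore_reset attribute described in the cover letter above. It
rests on assumptions: the attribute exists only on a kernel with this patchset
applied, and its location under the per-device sysfs directory (for example
/sys/bus/pci/devices/<BDF>/ignore_reset) is inferred from the patch touching
drivers/pci/pci-sysfs.c, not confirmed by the thread. The helper name
set_ignore_reset() is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Write "1" or "0" to <dev_dir>/ignore_reset.
 *
 * dev_dir is the device's sysfs directory. Assumption: on a patched
 * kernel this would be something like /sys/bus/pci/devices/<BDF>.
 * Returns 0 on success or a negative errno value on failure.
 */
int set_ignore_reset(const char *dev_dir, int enable)
{
    char path[512];
    FILE *f;
    int n;

    assert(dev_dir != NULL);

    n = snprintf(path, sizeof(path), "%s/ignore_reset", dev_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -ENAMETOOLONG;

    f = fopen(path, "w");
    if (!f)
        return -errno;

    if (fputs(enable ? "1\n" : "0\n", f) == EOF) {
        int err = errno ? errno : EIO;

        fclose(f);
        return -err;
    }

    /* stdio buffers the write, so a store error may only show up here. */
    if (fclose(f) != 0)
        return -errno;

    return 0;
}
```

A caller would invoke set_ignore_reset(dev_dir, 1) for each selected GPU after
step 2 and before starting the guest VM in step 3, and would treat a negative
return value as "attribute missing or write rejected".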
>
> This all seems like code to mask that these NVSwitch configurations are
> probably insecure because we can't factor and manage NVSwitch isolation
> into IOMMU grouping. I'm guessing this "service VM" pokes proprietary
> registers to manage that isolation and perhaps later resetting devices
> negates that programming. A more proper solution is probably to do our
> best to guess the span of an NVSwitch configuration and make the IOMMU
> group include all the devices, until NVIDIA provides proper code for
> the kernel to understand this interconnect and how it affects DMA
> isolation. Nak on disabling resets for the purpose of preventing a
> user from undoing proprietary device programming. Thanks,
>
> Alex
>
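
To make the IOMMU grouping point above concrete, here is a small userspace
sketch that checks whether two PCI devices currently sit in the same IOMMU
group. It relies on the iommu_group symlink that sysfs exposes in a device's
directory, whose target ends in the group number. The helper names
iommu_group_from_link() and same_iommu_group() are hypothetical, and the
sketch only inspects existing grouping; it does not implement the NVSwitch-aware
grouping suggested above.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Return the IOMMU group number that link_path points at, or a negative
 * errno value. link_path is expected to be a device's iommu_group
 * symlink, e.g. /sys/bus/pci/devices/<BDF>/iommu_group.
 */
int iommu_group_from_link(const char *link_path)
{
    char target[512];
    char *base, *end;
    long group;
    ssize_t n;

    assert(link_path != NULL);

    /* readlink() does not NUL-terminate, so leave room and do it here. */
    n = readlink(link_path, target, sizeof(target) - 1);
    if (n < 0)
        return -errno;
    if ((size_t)n == sizeof(target) - 1)
        return -ENAMETOOLONG;   /* target may have been truncated */
    target[n] = '\0';

    /* The group number is the last path component of the link target. */
    base = strrchr(target, '/');
    base = base ? base + 1 : target;

    errno = 0;
    group = strtol(base, &end, 10);
    if (errno || end == base || *end != '\0' || group < 0 || group > INT_MAX)
        return -EINVAL;

    return (int)group;
}

/* Return 1 if both links name the same group, 0 if not, negative on error. */
int same_iommu_group(const char *link_a, const char *link_b)
{
    int a = iommu_group_from_link(link_a);
    int b = iommu_group_from_link(link_b);

    if (a < 0)
        return a;
    if (b < 0)
        return b;
    return a == b;
}
```

If a GPU and the NVSwitch it is wired to report different groups, the kernel
is treating them as isolated from each other, which is the gap the reply
describes.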