Re: [RFC PATCH 2/3] vfio/hisilicon: register the driver to vfio

From: Jason Gunthorpe
Date: Tue Apr 20 2021 - 19:18:39 EST

Next message: patchwork-bot+netdevbpf: "Re: [net-next] net: dsa: felix: disable always guard band bit for TAS config"
Previous message: James Bottomley: "Re: [PATCH v9 1/4] KEYS: trusted: Add generic trusted keys framework"
In reply to: Alex Williamson: "Re: [RFC PATCH 2/3] vfio/hisilicon: register the driver to vfio"
Next in thread: liulongfang: "Re: [RFC PATCH 2/3] vfio/hisilicon: register the driver to vfio"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, Apr 20, 2021 at 04:04:57PM -0600, Alex Williamson wrote:

> > The migration control registers must be on a different VF from the VF
> > being plugged into a guest and the two VFs have to be in different
> > IOMMU groups to ensure they are isolated from each other.
>
> I think that's a solution, I don't know if it's the only solution.

Maybe, but that approach does offer DMA access for the migration, for
instance to implement something that needs a lot of data, like
migrating a complicated device state, dirty page tracking, or
whatever.

This driver seems very simple - it has only 17 state elements - and
doesn't use DMA.

I can't quite tell, but does this pass the hypervisor BAR into the
guest anyhow? That would certainly be an adequate statement that it is
safe, assuming someone did a good security analysis.

> ways and it's not very interesting. If the user can manipulate device
> state in order to trigger an exploit of the host-side kernel driver,
> that's obviously more of a problem.

Well, for instance, we have an implementation of
(VFIO_DEVICE_STATE_SAVING | VFIO_DEVICE_STATE_RUNNING) which means the
guest CPUs are still running and a hostile guest can be manipulating
the device.

But this driver is running code, like vf_qm_state_pre_save() in this
state. Looks very suspicious.
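
To make the concern concrete: in the v1 migration uAPI, device_state is
a bitmask, and SAVING together with RUNNING is the pre-copy phase in
which the guest has not been stopped. A minimal sketch follows. The bit
values are quoted from memory of include/uapi/linux/vfio.h and should be
checked there; the SKETCH_ names and the helper are hypothetical and not
part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed bit layout of the v1 VFIO migration device_state field.
 * Local SKETCH_ names are used so nothing here is mistaken for the
 * real uAPI header.
 */
#define SKETCH_STATE_RUNNING  (1u << 0)
#define SKETCH_STATE_SAVING   (1u << 1)
#define SKETCH_STATE_RESUMING (1u << 2)

/*
 * Hypothetical helper: true when state is being saved while the guest
 * vCPUs are still running, i.e. a hostile guest can touch the device
 * concurrently with any host-side save code.
 */
bool guest_can_race_with_save(uint32_t device_state)
{
	const uint32_t precopy = SKETCH_STATE_SAVING | SKETCH_STATE_RUNNING;

	return (device_state & precopy) == precopy;
}
```

Any host code reachable while this predicate is true has to treat the
device registers as attacker controlled.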

One quick attack I can imagine is to use the guest CPU to DoS the
migration and permanently block it, e.g. by causing qm_mb() or other
looping functions to fail.
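
As a toy user-space model of that attack (not the driver's actual code;
every name below is hypothetical): a host-side loop polls a busy flag
with a bounded budget, and a guest that keeps the flag busy makes every
attempt time out.

```c
#include <stdbool.h>

enum mb_result { MB_OK = 0, MB_TIMEOUT = -1 };

/* Hypothetical stand-in for reading a hardware "mailbox busy" bit. */
typedef bool (*mb_busy_fn)(void *ctx);

/* Poll until the mailbox is idle or the poll budget is exhausted. */
enum mb_result mb_wait_idle(mb_busy_fn busy, void *ctx,
			    unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (!busy(ctx))
			return MB_OK;
	}
	return MB_TIMEOUT;
}

/* Models a guest that keeps the mailbox busy for a number of polls. */
struct fake_guest {
	unsigned int busy_polls_left;
};

bool fake_guest_busy(void *ctx)
{
	struct fake_guest *g = ctx;

	if (g->busy_polls_left == 0)
		return false;
	g->busy_polls_left--;
	return true;
}
```

A well-behaved guest lets mb_wait_idle() succeed; one that can hold the
flag busy for longer than the budget forces MB_TIMEOUT on every save
attempt, which is the "permanently block it" case.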

There may be worse things possible, it is a bit hard to tell just from
the code.

.. also drivers should not be open coding ARM assembly as in
qm_mb_write()

.. and also, code cannot randomly call pci_get_drvdata() on a struct
device it isn't attached to without having verified that the right
driver is bound and without holding the correct locks.
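
The shape of the missing check can be sketched in a user-space model.
All types and names here are hypothetical stand-ins; in the kernel the
equivalent would presumably hold device_lock() and compare dev->driver
against the expected driver before trusting the drvdata pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device_driver and struct device. */
struct fake_driver {
	const char *name;
};

struct fake_device {
	const struct fake_driver *driver; /* NULL when nothing is bound */
	void *drvdata;                    /* meaningful only to ->driver */
	bool locked;                      /* models holding the device lock */
};

/*
 * Return the private data only if the caller holds the lock and the
 * expected driver is the one bound; otherwise the pointer could belong
 * to some other driver (or be stale) and must not be interpreted.
 */
void *drvdata_if_bound_to(struct fake_device *dev,
			  const struct fake_driver *expected)
{
	assert(dev->locked);

	if (dev->driver != expected)
		return NULL;
	return dev->drvdata;
}
```

The lock matters because without it the driver could unbind between the
check and the use of the returned pointer.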

> manipulate the BAR size to expose only the operational portion of MMIO
> to the VM and use the remainder to support migration itself. I'm
> afraid that just like mdev, the vfio migration uAPI is going to be used
> as an excuse to create kernel drivers simply to be able to make use of
> that uAPI.

I thought that is the general direction people had agreed on during
the IDXD mdev discussion?

People want the VFIO ioctls to be the single API that all the VMMs are
programmed to, and do not want to implement user space drivers.

This actually seems like a great candidate for a userspace driver.

I would like to know we are still settled on this direction, as the
mlx5 drivers we are working on also have a complicated option of being
user space only.

Jason
