Re: Plan for /dev/ioasid RFC v2

From: Liu Yi L
Date: Wed Jun 16 2021 - 23:41:24 EST


Hi Alex,

On Wed, 16 Jun 2021 13:39:37 -0600, Alex Williamson wrote:

> On Wed, 16 Jun 2021 06:43:23 +0000
> "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
>
> > > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > Sent: Wednesday, June 16, 2021 12:12 AM
> > >
> > > On Tue, 15 Jun 2021 02:31:39 +0000
> > > "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> > >
> > > > > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > > > Sent: Tuesday, June 15, 2021 12:28 AM
> > > > >
> > > > [...]
> > > > > > IOASID. Today the group fd requires an IOASID before it hands out a
> > > > > > device_fd. With iommu_fd the device_fd will not allow IOCTLs until it
> > > > > > > has a blocked DMA IOASID and is successfully joined to an iommu_fd.
> > > > >
> > > > > Which is the root of my concern. Who owns ioctls to the device fd?
> > > > > It's my understanding this is a vfio provided file descriptor and it's
> > > > > therefore vfio's responsibility. A device-level IOASID interface
> > > > > therefore requires that vfio manage the group aspect of device access.
> > > > > AFAICT, that means that device access can therefore only begin when all
> > > > > devices for a given group are attached to the IOASID and must halt for
> > > > > all devices in the group if any device is ever detached from an IOASID,
> > > > > even temporarily. That suggests a lot more oversight of the IOASIDs by
> > > > > vfio than I'd prefer.
> > > > >
> > > >
> > > > This is possibly the point that is worthy of more clarification and
> > > > alignment, as it sounds like the root of controversy here.
> > > >
> > > > I feel the goal of vfio group management is more about ownership, i.e.
> > > > all devices within a group must be assigned to a single user. Following
> > > > > the three rules defined by Jason, what we really care about is whether a group
> > > > of devices can be isolated from the rest of the world, i.e. no access to
> > > > memory/device outside of its security context and no access to its
> > > > security context from devices outside of this group. This can be achieved
> > > > as long as every device in the group is either in block-DMA state when
> > > > it's not attached to any security context or attached to an IOASID context
> > > > in IOMMU fd.
> > > >
> > > > As long as group-level isolation is satisfied, how devices within a group
> > > > are further managed is decided by the user (unattached, all attached to
> > > > same IOASID, attached to different IOASIDs) as long as the user
> > > > > understands the implication of lacking isolation within the group. This
> > > > > is where a device-centric model comes into play. Misconfiguration just hurts
> > > > the user itself.
> > > >
> > > > > If this rationale can be agreed on, then I don't see the point of having VFIO
> > > > > mandate that all devices in the group be attached/detached in
> > > > lockstep.
> > >
> > > In theory this sounds great, but there are still too many assumptions
> > > and too much hand waving about where isolation occurs for me to feel
> > > like I really have the complete picture. So let's walk through some
> > > examples. Please fill in and correct where I'm wrong.
> >
> > Thanks for putting these examples. They are helpful for clearing the
> > whole picture.
> >
> > Before filling in let's first align on what is the key difference between
> > current VFIO model and this new proposal. With this comparison we'll
> > know which of following questions are answered with existing VFIO
> > mechanism and which are handled differently.
> >
> > With Yi's help we figured out the current mechanism:
> >
> > 1) vfio_group_viable. The code comment explains the intention clearly:
> >
> > --
> > * A vfio group is viable for use by userspace if all devices are in
> > * one of the following states:
> > * - driver-less
> > * - bound to a vfio driver
> > * - bound to an otherwise allowed driver
> > * - a PCI interconnect device
> > --
> >
> > Note this check is not related to an IOMMU security context.
>
> Because this is a pre-requisite for imposing that IOMMU security
> context.
>
> > 2) vfio_iommu_group_notifier. When an IOMMU_GROUP_NOTIFY_BOUND_DRIVER
> > event is notified, vfio_group_viable is re-evaluated.
> > If the affected group was previously viable but now becomes not
> > viable, BUG_ON() as it implies that this device is bound to a non-vfio
> > driver which breaks the group isolation.
>
> This notifier action is conditional on there being users of devices
> within a secure group IOMMU context.
>
> > 3) vfio_group_get_device_fd. User can acquire a device fd only after
> > a) the group is viable;
> > b) the group is attached to a container;
> > c) iommu is set on the container (implying a security context
> > established);
>
> The order is actually b) a) c) but arguably b) is a no-op until:
>
> d) a device fd is provided to the user

Per the code in QEMU's vfio_get_group(), the order is a) b) c): the
group's viability is checked first, and only then is the group attached
to a container, which happens in vfio_connect_container().

VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp)
{
...
    group = g_malloc0(sizeof(*group));

    snprintf(path, sizeof(path), "/dev/vfio/%d", groupid);
    group->fd = qemu_open_old(path, O_RDWR);
    if (group->fd < 0) {
        error_setg_errno(errp, errno, "failed to open %s", path);
        goto free_group_exit;
    }

    if (ioctl(group->fd, VFIO_GROUP_GET_STATUS, &status)) {
        error_setg_errno(errp, errno, "failed to get group %d status", groupid);
        goto close_fd_exit;
    }

    if (!(status.flags & VFIO_GROUP_FLAGS_VIABLE)) {
        error_setg(errp, "group %d is not viable", groupid);
        error_append_hint(errp,
                          "Please ensure all devices within the iommu_group "
                          "are bound to their vfio bus driver.\n");
        goto close_fd_exit;
    }

    group->groupid = groupid;
    QLIST_INIT(&group->device_list);

    if (vfio_connect_container(group, as, errp)) {
        error_prepend(errp, "failed to setup container for group %d: ",
                      groupid);
        goto close_fd_exit;
    }

...
}
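
For reference, the same a) b) c) d) ordering can be sketched against the
raw VFIO group/container uAPI from <linux/vfio.h>. This is only an
illustration, not QEMU code: the helper name, the struct vfio_fds type
and the choice of VFIO_TYPE1_IOMMU as the backend are assumptions made
for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

/* Hypothetical holder for the three fds the caller must keep open. */
struct vfio_fds {
    int container;
    int group;
    int device;
};

/*
 * Hypothetical helper (not a QEMU or kernel function).
 * Returns 0 and fills *out on success, -1 on any failure.
 */
int vfio_open_device_sketch(int groupid, const char *dev_name,
                            struct vfio_fds *out)
{
    struct vfio_group_status status = { .argsz = sizeof(status) };
    char path[32];
    int container, group, device;

    assert(out != NULL);

    container = open("/dev/vfio/vfio", O_RDWR);
    if (container < 0)
        return -1;

    snprintf(path, sizeof(path), "/dev/vfio/%d", groupid);
    group = open(path, O_RDWR);
    if (group < 0)
        goto close_container;

    /* a) the group must be viable */
    if (ioctl(group, VFIO_GROUP_GET_STATUS, &status) ||
        !(status.flags & VFIO_GROUP_FLAGS_VIABLE))
        goto close_group;

    /* b) attach the group to a container */
    if (ioctl(group, VFIO_GROUP_SET_CONTAINER, &container))
        goto close_group;

    /* c) set an IOMMU backend on the container (Type1 is an assumption) */
    if (ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU))
        goto close_group;

    /* d) only now is a device fd requested from the group */
    device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, dev_name);
    if (device < 0)
        goto close_group;

    out->container = container;
    out->group = group;
    out->device = device;
    return 0;

close_group:
    close(group);
close_container:
    close(container);
    return -1;
}
```

A caller would pass the group number and a device name such as
"0000:06:0d.0" (just an example address), and any failure at step a),
b) or c) means no device fd is ever requested.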

--
Regards,
Yi Liu