Re: [PATCH 3/5] iommu/s390: Use RCU to allow concurrent domain_list iteration

From: Niklas Schnelle
Date: Thu Oct 27 2022 - 09:38:06 EST


On Thu, 2022-10-27 at 09:56 -0300, Jason Gunthorpe wrote:
> On Thu, Oct 27, 2022 at 02:44:49PM +0200, Niklas Schnelle wrote:
> > On Mon, 2022-10-24 at 13:26 -0300, Jason Gunthorpe wrote:
> > > On Mon, Oct 24, 2022 at 05:22:24PM +0200, Niklas Schnelle wrote:
> > >
> > > > Thanks for the explanation, still would like to grok this a bit more if
> > > > you don't mind. If I do read things correctly synchronize_rcu() should
> > > > run in the context of the VFIO ioctl in this case and shouldn't block
> > > > anything else in the kernel, correct? At least that's how I understand
> > > > the synchronize_rcu() comments and the fact that e.g.
> > > > net/vmw_vsock/virtio_transport.c:virtio_vsock_remove() also does a
> > > > synchronize_rcu() and can be triggered from user-space too.
> > >
> > > Yes, but I wouldn't look in the kernel to understand if things are OK
> > >
> > > > So we're
> > > > more worried about user-space getting slowed down rather than a Denial-
> > > > of-Service against other kernel tasks.
> > >
> > > Yes, functionally it is OK, but for something like vfio with vIOMMU
> > > you could be looking at several domains that have to be detached
> > > sequentially and with grace periods > 1s you can reach multiple
> > > seconds to complete something like a close() system call. Generally it
> > > should be weighed carefully
> > >
> > > Jason
> >
> > Thanks for the detailed explanation. Then let's not put a
> > synchronize_rcu() in detach, as I said as long as the I/O translation
> > tables are there an IOTLB flush after zpci_unregister_ioat() should
> > result in an ignorable error. That said, I think if we don't have the
> > synchronize_rcu() in detach we need it in s390_domain_free() before
> > freeing the I/O translation tables.
>
> Yes, it would be appropriate to free those using one of the rcu
> free'rs, (eg kfree_rcu) not synchronize_rcu()
>
> Jason

They are allocated via kmem_cache_alloc() from caches shared by all
IOMMUs, so we can't use kfree_rcu() directly. Also, we only ever free the
entire I/O translation table of one IOMMU at once, after it is no longer
in use. Before that it only grows. So I think synchronize_rcu() is the
obvious and simple choice, since we only need one grace period.
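
To illustrate the "one grace period" argument, here is a minimal userspace
sketch, not the actual patch. Everything in it is hypothetical: struct table,
NR_ENTRIES and the *_stub() helpers merely stand in for the real translation
table layout, synchronize_rcu() and kmem_cache_free(). The point is only that
waiting once before tearing down the whole table covers every entry, so the
cost does not scale with the table size.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model: these counters and stubs stand in for kernel APIs. */
static int grace_periods;  /* number of times we waited for readers */
static int objects_freed;  /* number of objects returned to the "cache" */

/* Stand-in for synchronize_rcu(): just records that we waited once. */
static void synchronize_rcu_stub(void)
{
	grace_periods++;
}

/* Stand-in for kmem_cache_free(): frees and counts the object. */
static void cache_free_stub(void *obj)
{
	free(obj);
	objects_freed++;
}

#define NR_ENTRIES 4  /* hypothetical, kept tiny for illustration */

/* Hypothetical table: one top-level object pointing at lower levels. */
struct table {
	void *entries[NR_ENTRIES];
};

static struct table *table_alloc(void)
{
	struct table *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;
	for (int i = 0; i < NR_ENTRIES; i++) {
		t->entries[i] = malloc(64);
		if (!t->entries[i]) {
			/* Undo the partial allocation; free(NULL) is a no-op. */
			for (int j = 0; j < i; j++)
				free(t->entries[j]);
			free(t);
			return NULL;
		}
	}
	return t;
}

/*
 * Free the whole table at once. A single wait is enough because the table
 * is already unpublished and nothing is removed from it before this point.
 */
static void table_free_after_grace_period(struct table *t)
{
	assert(t != NULL);
	synchronize_rcu_stub();  /* one grace period for the entire table */
	for (int i = 0; i < NR_ENTRIES; i++)
		cache_free_stub(t->entries[i]);
	cache_free_stub(t);
}
```

With per-object deferred freeing, each entry would instead need its own
callback queued, which is the part that gets awkward when the objects come
from a shared cache.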