RE: [PATCHv2 09/10] vfio/mdev: Avoid creating sysfs remove file on stale device removal

From: Parav Pandit
Date: Thu May 09 2019 - 12:17:45 EST




> -----Original Message-----
> From: Cornelia Huck <cohuck@xxxxxxxxxx>
> Sent: Thursday, May 9, 2019 4:18 AM
> To: Parav Pandit <parav@xxxxxxxxxxxx>
> Cc: kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> kwankhede@xxxxxxxxxx; alex.williamson@xxxxxxxxxx; cjia@xxxxxxxxxx
> Subject: Re: [PATCHv2 09/10] vfio/mdev: Avoid creating sysfs remove file on
> stale device removal
>
> On Wed, 8 May 2019 22:13:28 +0000
> Parav Pandit <parav@xxxxxxxxxxxx> wrote:
>
> > > -----Original Message-----
> > > From: Cornelia Huck <cohuck@xxxxxxxxxx>
> > > Sent: Wednesday, May 8, 2019 12:17 PM
> > > To: Parav Pandit <parav@xxxxxxxxxxxx>
> > > Cc: kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > kwankhede@xxxxxxxxxx; alex.williamson@xxxxxxxxxx; cjia@xxxxxxxxxx
> > > Subject: Re: [PATCHv2 09/10] vfio/mdev: Avoid creating sysfs remove
> > > file on stale device removal
> > >
> > > On Tue, 30 Apr 2019 17:49:36 -0500
> > > Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> > >
> > > > If device is removal is initiated by two threads as below, mdev
> > > > core attempts to create a syfs remove file on stale device.
> > > > During this flow, below [1] call trace is observed.
> > > >
> > > > cpu-0 cpu-1
> > > > ----- -----
> > > > mdev_unregister_device()
> > > > device_for_each_child
> > > > mdev_device_remove_cb
> > > > mdev_device_remove
> > > > user_syscall
> > > > remove_store()
> > > > mdev_device_remove()
> > > > [..]
> > > > unregister device();
> > > > /* not found in list or
> > > > * active=false.
> > > > */
> > > > sysfs_create_file()
> > > > ..Call trace
> > > >
> > > > Now that mdev core follows correct device removal system of the
> > > > linux bus model, remove shouldn't fail in normal cases. If it
> > > > fails, there is no point of creating a stale file or checking for specific
> error status.
> > >
> > > Which error cases are left? Is there anything that does not indicate
> > > that something got terribly messed up internally?
> > >
> > Few reasons I can think of that can fail remove are:
> >
> > 1. Some device removal requires allocating memory too as it needs to issue
> commands to device.
> > If on the path, such allocation fails, remove can fail. However such fail to
> allocate memory will probably result into more serious warnings before this.
>
> Nod. If we're OOM, we probably have some bigger problems anyway.
>
> > 2. if the device firmware has crashed, device removal commands will likely
> timeout and return such error upto user.
>
> In that case, I'd consider the device pretty much unusable in any case.
>
Right.

> > 3. If user tries to remove a device, while parent is already in removal path,
> this call will eventually fail as it won't find the device in the internal list.
>
> This should be benign, I think.
>
Right.

> >
> > > >
> > > > kernel: WARNING: CPU: 2 PID: 9348 at fs/sysfs/file.c:327
> > > > sysfs_create_file_ns+0x7f/0x90
> > > > kernel: CPU: 2 PID: 9348 Comm: bash Kdump: loaded Not tainted
> > > > 5.1.0-rc6-vdevbus+ #6
> > > > kernel: Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS
> > > > 2.0b
> > > > 08/09/2016
> > > > kernel: RIP: 0010:sysfs_create_file_ns+0x7f/0x90
> > > > kernel: Call Trace:
> > > > kernel: remove_store+0xdc/0x100 [mdev]
> > > > kernel: kernfs_fop_write+0x113/0x1a0
> > > > kernel: vfs_write+0xad/0x1b0
> > > > kernel: ksys_write+0x5a/0xe0
> > > > kernel: do_syscall_64+0x5a/0x210
> > > > kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > > >
> > > > Signed-off-by: Parav Pandit <parav@xxxxxxxxxxxx>
> > > > ---
> > > > drivers/vfio/mdev/mdev_sysfs.c | 4 +---
> > > > 1 file changed, 1 insertion(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/vfio/mdev/mdev_sysfs.c
> > > > b/drivers/vfio/mdev/mdev_sysfs.c index 9f774b91d275..ffa3dcebf201
> > > > 100644
> > > > --- a/drivers/vfio/mdev/mdev_sysfs.c
> > > > +++ b/drivers/vfio/mdev/mdev_sysfs.c
> > > > @@ -237,10 +237,8 @@ static ssize_t remove_store(struct device
> > > > *dev,
> > > struct device_attribute *attr,
> > > > int ret;
> > > >
> > > > ret = mdev_device_remove(dev);
> > > > - if (ret) {
> > > > - device_create_file(dev, attr);
> > > > + if (ret)
> > >
> > > Should you merge this into the previous patch?
> > >
> > I am not sure. Previous patch changes the sequence. I think that deserved
> an own patch by itself.
> > This change is making use of that sequence.
> > So its easier to review? Alex had comment in v0 to split into more logical
> patches, so...
> > Specially to capture a different call trace, I cut into different patch.
> > Otherwise previous patch's commit message is too long.
>
> I'm not sure if splitting out this one is worth it... your call.
>
Ok. either ways...
> >
> > > > return ret;
> > > > - }
> > > > }
> > > >
> > > > return count;
> >