Re: [PATCHv2 09/10] vfio/mdev: Avoid creating sysfs remove file on stale device removal
From: Cornelia Huck
Date: Wed May 08 2019 - 13:17:49 EST
On Tue, 30 Apr 2019 17:49:36 -0500
Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> If device is removal is initiated by two threads as below, mdev core
> attempts to create a syfs remove file on stale device.
> During this flow, below [1] call trace is observed.
>
> cpu-0 cpu-1
> ----- -----
> mdev_unregister_device()
> device_for_each_child
> mdev_device_remove_cb
> mdev_device_remove
> user_syscall
> remove_store()
> mdev_device_remove()
> [..]
> unregister device();
> /* not found in list or
> * active=false.
> */
> sysfs_create_file()
> ..Call trace
>
> Now that mdev core follows correct device removal system of the linux
> bus model, remove shouldn't fail in normal cases. If it fails, there is
> no point of creating a stale file or checking for specific error status.
Which error cases are left? Is there anything that does not indicate
that something got terribly messed up internally?
>
> kernel: WARNING: CPU: 2 PID: 9348 at fs/sysfs/file.c:327
> sysfs_create_file_ns+0x7f/0x90
> kernel: CPU: 2 PID: 9348 Comm: bash Kdump: loaded Not tainted
> 5.1.0-rc6-vdevbus+ #6
> kernel: Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b
> 08/09/2016
> kernel: RIP: 0010:sysfs_create_file_ns+0x7f/0x90
> kernel: Call Trace:
> kernel: remove_store+0xdc/0x100 [mdev]
> kernel: kernfs_fop_write+0x113/0x1a0
> kernel: vfs_write+0xad/0x1b0
> kernel: ksys_write+0x5a/0xe0
> kernel: do_syscall_64+0x5a/0x210
> kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> Signed-off-by: Parav Pandit <parav@xxxxxxxxxxxx>
> ---
> drivers/vfio/mdev/mdev_sysfs.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/vfio/mdev/mdev_sysfs.c b/drivers/vfio/mdev/mdev_sysfs.c
> index 9f774b91d275..ffa3dcebf201 100644
> --- a/drivers/vfio/mdev/mdev_sysfs.c
> +++ b/drivers/vfio/mdev/mdev_sysfs.c
> @@ -237,10 +237,8 @@ static ssize_t remove_store(struct device *dev, struct device_attribute *attr,
> int ret;
>
> ret = mdev_device_remove(dev);
> - if (ret) {
> - device_create_file(dev, attr);
> + if (ret)
Should you merge this into the previous patch?
> return ret;
> - }
> }
>
> return count;