RE: [PATCH] vfio/mdev: fix possible memory leak in module init funcs

From: Tian, Kevin
Date: Wed Nov 30 2022 - 21:01:06 EST


> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Thursday, December 1, 2022 7:06 AM
>
> [Cc +vfio-ap, vfio-ccw]
>
> On Fri, 18 Nov 2022 11:28:27 +0800
> ruanjinjie <ruanjinjie@xxxxxxxxxx> wrote:
>
> > Inject fault while probing module, if device_register() fails,
> > but the refcount of kobject is not decreased to 0, the name
> > allocated in dev_set_name() is leaked. Fix this by calling
> > put_device(), so that name can be freed in callback function
> > kobject_cleanup().

It's not just about the name. The problem of kboject not being
released is a bigger one.

put_device() is always required no matter device_register()
succeeds or not:

* NOTE: _Never_ directly free @dev after calling this function, even
* if it returned an error! Always use put_device() to give up the
* reference initialized in this function instead.
*/
int device_register(struct device *dev)

> > @@ -1430,8 +1430,10 @@ static int __init mbochs_dev_init(void)
> > dev_set_name(&mbochs_dev, "%s", MBOCHS_NAME);
> >
> > ret = device_register(&mbochs_dev);
> > - if (ret)
> > + if (ret) {
> > + put_device(&mbochs_dev);
> > goto err_class;
> > + }
> >
> > ret = mdev_register_parent(&mbochs_parent, &mbochs_dev,
> &mbochs_driver,
> > mbochs_mdev_types,
>
>
> vfio-ap has a similar unwind as the sample drivers, but actually makes
> an attempt to catch this ex:
>
> ...
> ret = device_register(&matrix_dev->device);
> if (ret)
> goto matrix_reg_err;
>
> ret = driver_register(&matrix_driver);
> if (ret)
> goto matrix_drv_err;
>
> return 0;
>
> matrix_drv_err:
> device_unregister(&matrix_dev->device);
> matrix_reg_err:
> put_device(&matrix_dev->device);
> ...
>
> So of the vfio drivers calling device_register(), vfio-ap is the only
> one that does a put_device() if device_register() fails, but it also
> seems sketchy to call both device_unregister() and put_device() in the
> case that we exit via matrix_drv_err.
>
> I wonder if all of these shouldn't adopt a flow like:
>
> ret = device_register(&dev);
> if (ret)
> goto err1;
>
> ....
>
> return 0;
>
> err2:
> device_del(&dev);
> err1:
> put_device(&dev);
>

It's kind of a mixed model.

With above unwind it's clearer to use device_initialize() and device_add() instead.

Otherwise what this patch does looks better IMHO:

ret = device_register(&dev);
if (ret) {
put_device(&dev);
goto err1;
}

...

return 0;

err2:
device_unregister(&dev);
err1:
earlier_unwind();