Re: [PATCH] media: v4l2-dev: do not fire driver's release on __video_register_device() failure

From: Guangshuo Li

Date: Sun May 24 2026 - 03:23:34 EST


Hi Hans,

On Wed, 20 May 2026 at 20:45, Hans Verkuil <hverkuil+cisco@xxxxxxxxxx> wrote:
>
> On 20/05/2026 14:41, Laurent Pinchart wrote:
> > On Wed, May 20, 2026 at 03:02:19PM +0300, Sakari Ailus wrote:
> >> On Wed, May 20, 2026 at 01:26:30PM +0200, Hans Verkuil wrote:
> >>> On 20/05/2026 12:48, Laurent Pinchart wrote:
> >>>> On Wed, May 20, 2026 at 12:01:46PM +0200, Hans Verkuil wrote:
> >>>>> On 20/05/2026 11:34, Laurent Pinchart wrote:
> >>>>>> On Wed, May 20, 2026 at 05:06:24PM +0800, Guangshuo Li wrote:
> >>>>>>> video_register_device() / __video_register_device() registers vdev->dev
> >>>>>>> with device_register(). Before the call the video core sets
> >>>>>>>
> >>>>>>> vdev->dev.release = v4l2_device_release;
> >>>>>>>
> >>>>>>> v4l2_device_release() invokes vdev->release(vdev) as its last step, and
> >>>>>>> the driver's vdev->release hook is commonly video_device_release(), which
> >>>>>>> kfree()s the vdev that the driver allocated with video_device_alloc().
> >>>>>>>
> >>>>>>> When device_register() fails inside __video_register_device() the core
> >>>>>>> does
> >>>>>>>
> >>>>>>> put_device(&vdev->dev);
> >>>>>>> return ret;
> >>>>>>>
> >>>>>>> which drops the only reference and fires the v4l2_device_release()
> >>>>>>> chain:
> >>>>>>>
> >>>>>>> __video_register_device()
> >>>>>>> device_register() -> -E*
> >>>>>>> put_device(&vdev->dev)
> >>>>>>> -> v4l2_device_release()
> >>>>>>> -> vdev->release(vdev)
> >>>>>>> -> video_device_release(vdev) /* kfree(vdev), free #1 */
> >>>>>>>
> >>>>>>> video_register_device() returns the error to the driver. Drivers that
> >>>>>>> follow the documented ownership contract release vdev on their own error
> >>>>>>> path, e.g.
> >>>>>>>
> >>>>>>> driver_probe()
> >>>>>>> if (video_register_device(vdev, ...))
> >>>>>>> goto err_release_vdev;
> >>>>>>> ...
> >>>>>>> err_release_vdev:
> >>>>>>> video_device_release(vdev); /* free #2 -- DOUBLE FREE */
> >>>>>>>
> >>>>>>> This is the contract documented in
> >>>>>>> Documentation/driver-api/media/v4l2-dev.rst: the driver owns vdev and
> >>>>>>> is responsible for releasing it if video_register_device() fails. As
> >>>>>>> Hans Verkuil pointed out, the right place to fix this is the v4l2 core
> >>>>>>> rather than every individual driver, because drivers are expected to
> >>>>>>> follow the documented ownership contract.
> >>>>>>>
> >>>>>>> Neutralise vdev->release around put_device() in the device_register()
> >>>>>>> failure path so the device core cleanup does not run the driver's
> >>>>>>> release hook. The driver-supplied release is restored before returning
> >>>>>>> so the caller can release vdev according to the documented contract.
> >>>>>>> Successful registration is unchanged, so the normal teardown sequence
> >>>>>>> continues to call the driver's release hook and free vdev exactly once on
> >>>>>>> unregister.
> >>>>>>>
> >>>>>>> Fixes: 2a934fdb01db ("media: v4l2-dev: fix error handling in __video_register_device()")
> >>>>>>> Signed-off-by: Guangshuo Li <lgs201920130244@xxxxxxxxx>
> >>>>>>> ---
> >>>>>>> drivers/media/v4l2-core/v4l2-dev.c | 5 +++++
> >>>>>>> 1 file changed, 5 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-dev.c b/drivers/media/v4l2-core/v4l2-dev.c
> >>>>>>> index 6ce623a1245a..73648549eb2a 100644
> >>>>>>> --- a/drivers/media/v4l2-core/v4l2-dev.c
> >>>>>>> +++ b/drivers/media/v4l2-core/v4l2-dev.c
> >>>>>>> @@ -1075,9 +1075,14 @@ int __video_register_device(struct video_device *vdev,
> >>>>>>> mutex_lock(&videodev_lock);
> >>>>>>> ret = device_register(&vdev->dev);
> >>>>>>> if (ret < 0) {
> >>>>>>> + void (*release)(struct video_device *) = vdev->release;
> >>>>>>> +
> >>>>>>> mutex_unlock(&videodev_lock);
> >>>>>>> pr_err("%s: device_register failed\n", __func__);
> >>>>>>> +
> >>>>>>> + vdev->release = video_device_release_empty;
> >>>>>>> put_device(&vdev->dev);
> >>>>>>> + vdev->release = release;
> >>>>>>
> >>>>>> That looks like a big hack. There must be something wrong somewhere else
> >>>>>> in the design.
> >>>>>
> >>>>> There is, unfortunately the design was wrong since the beginning of V4L2.
> >>>>>
> >>>>> Documentation/driver-api/media/v4l2-dev.rst explicitly says that you should use:
> >>>>>
> >>>>> err = video_register_device(vdev, VFL_TYPE_VIDEO, -1);
> >>>>> if (err) {
> >>>>> video_device_release(vdev); /* or kfree(my_vdev); */
> >>>>> return err;
> >>>>> }
> >>>>>
> >>>>> So everyone does that. Luckily device_register never fails in practice (you probably have
> >>>>> bigger problems if it fails then just a double-free).
> >>>>>
> >>>>> The reality is that we don't handle this failure well at all, and this change wouldn't
> >>>>> help in all cases either. E.g. drivers/media/platform/renesas/renesas-ceu.c actually relies
> >>>>> on the release() callback in that it doesn't call video_device_release().
> >>>>>
> >>>>> But then it would fail on the v4l2_err(vdev->v4l2_dev, ...) call since vdev would be freed
> >>>>> already.
> >>>>>
> >>>>> There are probably more drivers like that. (drivers/media/i2c/video-i2c.c)
> >>>>>
> >>>>> I'm not sure what is wisdom here.
> >>>>>
> >>>>> See also commit 2a934fdb01db ("media: v4l2-dev: fix error handling in __video_register_device()"),
> >>>>> which is where the put_device was introduced in the first place.
> >>>>>
> >>>>> Perhaps that should be reverted instead?
> >>>>
> >>>> If we want a short term fix I think that would be better.
> >>>>
> >>>> Have you seen
> >>>> https://lore.kernel.org/all/aeCOdWLaVpH-5w8s@xxxxxxxxxxxxxxxxxxxx/ ?
> >>>
> >>> I hadn't seen it. Interesting.
> >>>
> >>>>
> >>>> Having an API contract different from device_register() will likely
> >>>> cause issues one way or another.
> >>>
> >>> Looking closely how the driver core works and what commit 2a934fdb01db changed,
> >>> I think this might fix it:
> >>>
> >>> diff --git a/drivers/media/v4l2-core/v4l2-dev.c b/drivers/media/v4l2-core/v4l2-dev.c
> >>> index 5516b2bbb08f..6ffd385e880f 100644
> >>> --- a/drivers/media/v4l2-core/v4l2-dev.c
> >>> +++ b/drivers/media/v4l2-core/v4l2-dev.c
> >>> @@ -1071,25 +1071,27 @@ int __video_register_device(struct video_device *vdev,
> >>> vdev->dev.class = &video_class;
> >>> vdev->dev.devt = MKDEV(VIDEO_MAJOR, vdev->minor);
> >>> vdev->dev.parent = vdev->dev_parent;
> >>> - vdev->dev.release = v4l2_device_release;
> >>> dev_set_name(&vdev->dev, "%s%d", name_base, vdev->num);
> >>>
> >>> - /* Increase v4l2_device refcount */
> >>> - v4l2_device_get(vdev->v4l2_dev);
> >>> -
> >>> mutex_lock(&videodev_lock);
> >>> ret = device_register(&vdev->dev);
> >>> if (ret < 0) {
> >>> mutex_unlock(&videodev_lock);
> >>> pr_err("%s: device_register failed\n", __func__);
> >>> put_device(&vdev->dev);
> >>> - return ret;
> >>> + goto cleanup;
> >>> }
> >>> + /* Register the release callback that will be called when the last
> >>> + reference to the device goes away. */
> >>> + vdev->dev.release = v4l2_device_release;
> >>>
> >>> if (nr != -1 && nr != vdev->num && warn_if_nr_in_use)
> >>> pr_warn("%s: requested %s%d, got %s\n", __func__,
> >>> name_base, nr, video_device_node_name(vdev));
> >>>
> >>> + /* Increase v4l2_device refcount */
> >>> + v4l2_device_get(vdev->v4l2_dev);
> >>> +
> >>> /* Part 5: Register the entity. */
> >>> ret = video_register_media_controller(vdev);
> >>>
> >>> This mostly reverts 2a934fdb01db, except that we keep the put_device.
> >>> But when this is called, vdev->dev.release isn't set yet, so it only
> >>> frees the device-related data (device_release in drivers/base/core.c).
> >>>
> >>> Am I missing something?
> >>
> >> I guess this could be workable. This way the caller knows the release
> >> callback won't be called.
> >
I may be missing something, but I think the proposed change could
still cause a driver-core warning.

With `vdev->dev.release = v4l2_device_release` moved after
`device_register()`, the error path would call:

put_device(&vdev->dev);

while `vdev->dev.release` is still NULL. Unless either
`vdev->dev.type->release` or `vdev->dev.class->dev_release` is set,
`device_release()` will fall through to the WARN path:

WARN(1, KERN_ERR "Device '%s' does not have a release() function, it
is broken and must be fixed. ...",
dev_name(dev));

See:
https://codebrowser.dev/linux/linux/drivers/base/core.c.html

So I think the object still needs a valid release path before
`put_device()` can drop the last reference. Delaying
`vdev->dev.release` would avoid calling `v4l2_device_release()` and
thus avoid the double-free, but it may trade that for the driver-core
“no release() function” warning.

Maybe the short-term fix still needs a release callback that keeps the
driver core happy without invoking the driver-provided
`vdev->release()` on the `device_register()` failure path.

> > I think it's a short term hack at best though :-( If the device core
> > requires reference-counting for struct device, with a requirement to
> > call put_device() instead of just freeing the structure, then anything
> > that embeds a struct device should do the same. That means exposing
> > video_put() (which should probably be renamed to video_device_put()) and
> > calling that in the error path, in the caller.
> >
> > We may also want to split video_register_device() into
> > video_device_init() and video_device_add(), but that's a separate
> > question.
>
> I believe that would be the right approach long-term. I actually started
> on that once, but very quickly ran out of time.
>
> Regards,
>
> Hans
>
> >
> >> Regarding error handling, the return value from
> >> video_register_media_controller() is ignored. It'd probably be good to fix
> >> that in a separate patch though. I could submit one as well.
> >
>
Long-term, I agree with Laurent that a `video_device_put()`-style API,
possibly together with an init/add split, would make the lifetime
rules much clearer.

Regards,
Guangshuo