Re: [PATCH] media: v4l2-dev: do not fire driver's release on __video_register_device() failure

From: Hans Verkuil

Date: Wed May 20 2026 - 08:45:16 EST


On 20/05/2026 14:41, Laurent Pinchart wrote:
> On Wed, May 20, 2026 at 03:02:19PM +0300, Sakari Ailus wrote:
>> On Wed, May 20, 2026 at 01:26:30PM +0200, Hans Verkuil wrote:
>>> On 20/05/2026 12:48, Laurent Pinchart wrote:
>>>> On Wed, May 20, 2026 at 12:01:46PM +0200, Hans Verkuil wrote:
>>>>> On 20/05/2026 11:34, Laurent Pinchart wrote:
>>>>>> On Wed, May 20, 2026 at 05:06:24PM +0800, Guangshuo Li wrote:
>>>>>>> video_register_device() / __video_register_device() registers vdev->dev
>>>>>>> with device_register(). Before the call the video core sets
>>>>>>>
>>>>>>> vdev->dev.release = v4l2_device_release;
>>>>>>>
>>>>>>> v4l2_device_release() invokes vdev->release(vdev) as its last step, and
>>>>>>> the driver's vdev->release hook is commonly video_device_release(), which
>>>>>>> kfree()s the vdev that the driver allocated with video_device_alloc().
>>>>>>>
>>>>>>> When device_register() fails inside __video_register_device() the core
>>>>>>> does
>>>>>>>
>>>>>>> put_device(&vdev->dev);
>>>>>>> return ret;
>>>>>>>
>>>>>>> which drops the only reference and fires the v4l2_device_release()
>>>>>>> chain:
>>>>>>>
>>>>>>> __video_register_device()
>>>>>>> device_register() -> -E*
>>>>>>> put_device(&vdev->dev)
>>>>>>> -> v4l2_device_release()
>>>>>>> -> vdev->release(vdev)
>>>>>>> -> video_device_release(vdev) /* kfree(vdev), free #1 */
>>>>>>>
>>>>>>> video_register_device() returns the error to the driver. Drivers that
>>>>>>> follow the documented ownership contract release vdev on their own error
>>>>>>> path, e.g.
>>>>>>>
>>>>>>> driver_probe()
>>>>>>> if (video_register_device(vdev, ...))
>>>>>>> goto err_release_vdev;
>>>>>>> ...
>>>>>>> err_release_vdev:
>>>>>>> video_device_release(vdev); /* free #2 -- DOUBLE FREE */
>>>>>>>
>>>>>>> This is the contract documented in
>>>>>>> Documentation/driver-api/media/v4l2-dev.rst: the driver owns vdev and
>>>>>>> is responsible for releasing it if video_register_device() fails. As
>>>>>>> Hans Verkuil pointed out, the right place to fix this is the v4l2 core
>>>>>>> rather than every individual driver, because drivers are expected to
>>>>>>> follow the documented ownership contract.
>>>>>>>
>>>>>>> Neutralise vdev->release around put_device() in the device_register()
>>>>>>> failure path so the device core cleanup does not run the driver's
>>>>>>> release hook. The driver-supplied release is restored before returning
>>>>>>> so the caller can release vdev according to the documented contract.
>>>>>>> Successful registration is unchanged, so the normal teardown sequence
>>>>>>> continues to call the driver's release hook and free vdev exactly once on
>>>>>>> unregister.
>>>>>>>
>>>>>>> Fixes: 2a934fdb01db ("media: v4l2-dev: fix error handling in __video_register_device()")
>>>>>>> Signed-off-by: Guangshuo Li <lgs201920130244@xxxxxxxxx>
>>>>>>> ---
>>>>>>> drivers/media/v4l2-core/v4l2-dev.c | 5 +++++
>>>>>>> 1 file changed, 5 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-dev.c b/drivers/media/v4l2-core/v4l2-dev.c
>>>>>>> index 6ce623a1245a..73648549eb2a 100644
>>>>>>> --- a/drivers/media/v4l2-core/v4l2-dev.c
>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-dev.c
>>>>>>> @@ -1075,9 +1075,14 @@ int __video_register_device(struct video_device *vdev,
>>>>>>> mutex_lock(&videodev_lock);
>>>>>>> ret = device_register(&vdev->dev);
>>>>>>> if (ret < 0) {
>>>>>>> + void (*release)(struct video_device *) = vdev->release;
>>>>>>> +
>>>>>>> mutex_unlock(&videodev_lock);
>>>>>>> pr_err("%s: device_register failed\n", __func__);
>>>>>>> +
>>>>>>> + vdev->release = video_device_release_empty;
>>>>>>> put_device(&vdev->dev);
>>>>>>> + vdev->release = release;
>>>>>>
>>>>>> That looks like a big hack. There must be something wrong somewhere else
>>>>>> in the design.
>>>>>
>>>>> There is, unfortunately the design was wrong since the beginning of V4L2.
>>>>>
>>>>> Documentation/driver-api/media/v4l2-dev.rst explicitly says that you should use:
>>>>>
>>>>> err = video_register_device(vdev, VFL_TYPE_VIDEO, -1);
>>>>> if (err) {
>>>>> video_device_release(vdev); /* or kfree(my_vdev); */
>>>>> return err;
>>>>> }
>>>>>
>>>>> So everyone does that. Luckily device_register never fails in practice (you probably have
>>>>> bigger problems if it fails then just a double-free).
>>>>>
>>>>> The reality is that we don't handle this failure well at all, and this change wouldn't
>>>>> help in all cases either. E.g. drivers/media/platform/renesas/renesas-ceu.c actually relies
>>>>> on the release() callback in that it doesn't call video_device_release().
>>>>>
>>>>> But then it would fail on the v4l2_err(vdev->v4l2_dev, ...) call since vdev would be freed
>>>>> already.
>>>>>
>>>>> There are probably more drivers like that. (drivers/media/i2c/video-i2c.c)
>>>>>
>>>>> I'm not sure what is wisdom here.
>>>>>
>>>>> See also commit 2a934fdb01db ("media: v4l2-dev: fix error handling in __video_register_device()"),
>>>>> which is where the put_device was introduced in the first place.
>>>>>
>>>>> Perhaps that should be reverted instead?
>>>>
>>>> If we want a short term fix I think that would be better.
>>>>
>>>> Have you seen
>>>> https://lore.kernel.org/all/aeCOdWLaVpH-5w8s@xxxxxxxxxxxxxxxxxxxx/ ?
>>>
>>> I hadn't seen it. Interesting.
>>>
>>>>
>>>> Having an API contract different from device_register() will likely
>>>> cause issues one way or another.
>>>
>>> Looking closely how the driver core works and what commit 2a934fdb01db changed,
>>> I think this might fix it:
>>>
>>> diff --git a/drivers/media/v4l2-core/v4l2-dev.c b/drivers/media/v4l2-core/v4l2-dev.c
>>> index 5516b2bbb08f..6ffd385e880f 100644
>>> --- a/drivers/media/v4l2-core/v4l2-dev.c
>>> +++ b/drivers/media/v4l2-core/v4l2-dev.c
>>> @@ -1071,25 +1071,27 @@ int __video_register_device(struct video_device *vdev,
>>> vdev->dev.class = &video_class;
>>> vdev->dev.devt = MKDEV(VIDEO_MAJOR, vdev->minor);
>>> vdev->dev.parent = vdev->dev_parent;
>>> - vdev->dev.release = v4l2_device_release;
>>> dev_set_name(&vdev->dev, "%s%d", name_base, vdev->num);
>>>
>>> - /* Increase v4l2_device refcount */
>>> - v4l2_device_get(vdev->v4l2_dev);
>>> -
>>> mutex_lock(&videodev_lock);
>>> ret = device_register(&vdev->dev);
>>> if (ret < 0) {
>>> mutex_unlock(&videodev_lock);
>>> pr_err("%s: device_register failed\n", __func__);
>>> put_device(&vdev->dev);
>>> - return ret;
>>> + goto cleanup;
>>> }
>>> + /* Register the release callback that will be called when the last
>>> + reference to the device goes away. */
>>> + vdev->dev.release = v4l2_device_release;
>>>
>>> if (nr != -1 && nr != vdev->num && warn_if_nr_in_use)
>>> pr_warn("%s: requested %s%d, got %s\n", __func__,
>>> name_base, nr, video_device_node_name(vdev));
>>>
>>> + /* Increase v4l2_device refcount */
>>> + v4l2_device_get(vdev->v4l2_dev);
>>> +
>>> /* Part 5: Register the entity. */
>>> ret = video_register_media_controller(vdev);
>>>
>>> This mostly reverts 2a934fdb01db, except that we keep the put_device.
>>> But when this is called, vdev->dev.release isn't set yet, so it only
>>> frees the device-related data (device_release in drivers/base/core.c).
>>>
>>> Am I missing something?
>>
>> I guess this could be workable. This way the caller knows the release
>> callback won't be called.
>
> I think it's a short term hack at best though :-( If the device core
> requires reference-counting for struct device, with a requirement to
> call put_device() instead of just freeing the structure, then anything
> that embeds a struct device should do the same. That means exposing
> video_put() (which should probably be renamed to video_device_put()) and
> calling that in the error path, in the caller.
>
> We may also want to split video_register_device() into
> video_device_init() and video_device_add(), but that's a separate
> question.

I believe that would be the right approach long-term. I actually started
on that once, but very quickly ran out of time.

Regards,

Hans

>
>> Regarding error handling, the return value from
>> video_register_media_controller() is ignored. It'd probably be good to fix
>> that in a separate patch though. I could submit one as well.
>