Re: [BUG] cdev_put() race condition

From: Hans Verkuil
Date: Tue Dec 16 2008 - 05:06:36 EST


On Monday 08 December 2008 21:56:26 Hans Verkuil wrote:
> Hi Greg,
>
> Laurent found a race condition in the uvc driver that we traced to
> the way chrdev_open and cdev_put/get work.
>
> You need the following ingredients to reproduce it:
>
> 1) a hot-pluggable char device like an USB webcam.
> 2) a manually created device node for such a webcam instead of
> relying on udev.
>
> In order to easily force this situation you would also need to add a
> delay to the char device's release() function. For webcams that would
> be at the top of v4l2_chardev_release() in
> drivers/media/video/v4l2-dev.c. But adding a delay to e.g. cdev_purge
> would have the same effect.
>
> The sequence of events in the case of a webcam is as follows:
>
> 1) The USB device is removed, causing a disconnect.
>
> 2) The webcam driver unregisters the video device which in turn calls
> cdev_del().
>
> 3) When the last application using the device is closed, the cdev is
> released when the kref of the cdev's kobject goes to 0.
>
> 4) If the kref's release() call takes a while due to e.g. extra
> cleanup in the case of a webcam, then another application can try to
> open the video device. Note that this requires a device node created
> with mknod, otherwise the device nodes would already have been
> removed by udev.
>
> 5) chrdev_open checks inode->i_cdev. If this is NULL (i.e. this
> device node was never accessed before), then all is fine since
> kobj_lookup will fail because cdev_del() has been called earlier.
> However, if this device node was used earlier, then the else part is
> called:
> cdev_get(p). This 'p' is the cdev that is being released. Since the
> kref count is 0 you will get a WARN message from kref_get, but the
> code continues on, the f_op->open will (hopefully) return
> more-or-less gracefully with an error and the cdev_put at the end
> will cause the refcount to go to 0 again, which results in a SECOND
> call to the kref's release function!
>
> See this link for the original discussion on the v4l list containing
> stack traces an a patch that you need if you want to (and can) test
> this with the uvc driver:
>
> http://www.spinics.net/lists/vfl/msg39967.html
>
> Reading Documentation/kref.txt leads me to the conclusion that a
> mutex should be introduced to prevent cdev_get from being called
> while cdev_put is in progress. It should be a mutex instead of a
> spinlock because the kref's release() can do all sorts of cleanups,
> some of which might involve waits.
>
> I think that replacing cdev_lock with a mutex, adding it to
> cdev_put(), cdev_del() and removing it from cdev_purge() should do
> the trick (I hope).
>
> BTW: shouldn't cdev_del() call cdev_put() instead of kobject_put()?
> If I understand it correctly then cdev_add calls cdev_get (through
> exact_lock()), so cdev_del should do the reverse, right?
>
> Regards,
>
> Hans

Hi Greg,

Will you have time to look at this? I can try to make a patch for this,
but I'd like to get your feedback first.

Regards,

Hans

--
Hans Verkuil - video4linux developer - sponsored by TANDBERG
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/