[PATCH v2 0/5] media: uvcvideo: Fix race conditions

From: Guenter Roeck
Date: Tue Sep 08 2020 - 15:46:22 EST

The uvcvideo code has no lock protection against USB disconnects
while video operations are ongoing. This has resulted in random
error reports, typically pointing to a crash in usb_ifnum_to_if(),
called from usb_hcd_alloc_bandwidth(). A typical traceback is as follows.

usb 1-4: USB disconnect, device number 3
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
RIP: 0010:usb_ifnum_to_if+0x29/0x40
Code: <...>
RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
Call Trace:
uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
uvc_video_start_streaming+0x91/0xdd [uvcvideo]
uvc_start_streaming+0x28/0x5d [uvcvideo]
vb2_start_streaming+0x61/0x143 [videobuf2_common]
vb2_core_streamon+0xf7/0x10f [videobuf2_common]
uvc_queue_streamon+0x2e/0x41 [uvcvideo]
uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
? video_ioctl2+0x16/0x16

While there are not many references to this problem on mailing lists, it is
reported on a regular basis on various Chromebooks (roughly 300 reports
per month). The problem is relatively easy to reproduce by adding msleep()
calls into the code.
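
For illustration only, the class of race described above (a streaming call
using state that a concurrent disconnect has already torn down) and the
general shape of a lock-based fix can be sketched in user space. This is not
the driver code from this series. All names (sketch_dev, sketch_streamon,
and so on) are hypothetical, and a pthread mutex stands in for the kernel
locking.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device whose interface vanishes on disconnect. */
struct sketch_dev {
	pthread_mutex_t lock;	/* serializes streaming ops against disconnect */
	bool registered;	/* cleared by the disconnect path */
	int *intf;		/* set to NULL by the disconnect path */
};

void sketch_dev_init(struct sketch_dev *dev, int *intf)
{
	assert(dev != NULL);
	pthread_mutex_init(&dev->lock, NULL);
	dev->registered = true;
	dev->intf = intf;
}

/* Disconnect path: tear down state only while holding the lock. */
void sketch_disconnect(struct sketch_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->registered = false;
	dev->intf = NULL;
	pthread_mutex_unlock(&dev->lock);
}

/*
 * Streaming path: check the registered flag under the same lock before
 * dereferencing intf. Without the lock and the check, a disconnect that
 * runs between the caller's entry and the dereference leads to a NULL
 * pointer access.
 */
int sketch_streamon(struct sketch_dev *dev)
{
	int ret;

	pthread_mutex_lock(&dev->lock);
	if (!dev->registered)
		ret = -ENODEV;
	else
		ret = *dev->intf;
	pthread_mutex_unlock(&dev->lock);

	return ret;
}
```

In this sketch, a sleep placed between the lock-free equivalent of the
registered check and the dereference would widen the race window, which is
the same idea as the msleep() based reproduction mentioned above.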

I tried to reproduce the problem with non-uvcvideo webcams, but was
unsuccessful. I was unable to get Philips (pwc) webcams to work. gspca
based webcams don't experience the problem, or at least I was unable to
reproduce it (The gspca driver does not trigger sending USB messages in the
open function, and otherwise uses the locking mechanism provided by the
v4l2/vb2 core).

I don't presume to claim that I found every issue, but this patch series
should fix at least the major problems.

The patch series was tested extensively on a Chromebook running chromeos-4.19
and on a Linux system running a v5.8.y based kernel.

v2:
- Added details about problem frequency and testing with non-uvc webcams
  to summary
- In patch 4/5, return EPOLLERR instead of -ENODEV on poll errors
- Fix description in patch 5/5
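
As background for the EPOLLERR change: a poll handler returns an event mask
rather than an errno, so a negative error code would be read as a set of
event bits. The user-space sketch below illustrates this under stated
assumptions. The function names are hypothetical and it is not the code from
patch 4/5.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/epoll.h>

/* Hypothetical poll handler: the return value is an event mask. */
unsigned int sketch_poll(bool registered, bool frame_ready)
{
	assert(EPOLLERR != 0);

	if (!registered)
		return EPOLLERR;	/* report the failure as an event bit */

	return frame_ready ? (EPOLLIN | EPOLLRDNORM) : 0;
}

/*
 * What a caller would see if a negative errno were returned as a mask:
 * the two's complement value has many unrelated event bits set.
 */
unsigned int sketch_errno_as_mask(void)
{
	return (unsigned int)-ENODEV;
}
```

With this sketch, sketch_errno_as_mask() & EPOLLIN is non-zero on Linux,
that is, the error would look like readable data to the caller.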

Guenter Roeck (5):
  media: uvcvideo: Cancel async worker earlier
  media: uvcvideo: Lock video streams and queues while unregistering
  media: uvcvideo: Release stream queue when unregistering video device
  media: uvcvideo: Protect uvc queue file operations against disconnect
  media: uvcvideo: Abort uvc_v4l2_open if video device is unregistered

 drivers/media/usb/uvc/uvc_ctrl.c   | 11 ++++++----
 drivers/media/usb/uvc/uvc_driver.c | 12 ++++++++++
 drivers/media/usb/uvc/uvc_queue.c  | 32 +++++++++++++++++++++++++--
 drivers/media/usb/uvc/uvc_v4l2.c   | 45 ++++++++++++++++++++++++++++++++++++--
 drivers/media/usb/uvc/uvcvideo.h   |  1 +
 5 files changed, 93 insertions(+), 8 deletions(-)