Re: KASAN: null-ptr-deref Write in event_handler

From: Shuah Khan
Date: Wed Oct 07 2020 - 11:48:08 EST


On 10/7/20 8:28 AM, Andrey Konovalov wrote:
On Wed, Oct 7, 2020 at 3:56 PM Shuah Khan <skhan@xxxxxxxxxxxxxxxxxxx> wrote:

On 10/5/20 2:44 PM, Shuah Khan wrote:
On 10/5/20 8:04 AM, Andrey Konovalov wrote:
On Mon, Oct 5, 2020 at 3:59 PM syzbot
<syzbot+bf1a360e305ee719e364@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:

Hello,

syzbot found the following issue on:

HEAD commit: d3d45f82 Merge tag 'pinctrl-v5.9-2' of git://git.kernel.or..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15781d8f900000
kernel config: https://syzkaller.appspot.com/x/.config?x=89ab6a0c48f30b49
dashboard link: https://syzkaller.appspot.com/bug?extid=bf1a360e305ee719e364
compiler: gcc (GCC) 10.1.0-syz 20200507
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16cbaa7d900000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1364f367900000

IMPORTANT: if you fix the issue, please add the following tag to the
commit:
Reported-by: syzbot+bf1a360e305ee719e364@xxxxxxxxxxxxxxxxxxxxxxxxx

vhci_hcd: stop threads
vhci_hcd: release socket
vhci_hcd: disconnect device
==================================================================
BUG: KASAN: null-ptr-deref in instrument_atomic_write include/linux/instrumented.h:71 [inline]
BUG: KASAN: null-ptr-deref in atomic_fetch_add_relaxed include/asm-generic/atomic-instrumented.h:142 [inline]
BUG: KASAN: null-ptr-deref in refcount_add include/linux/refcount.h:201 [inline]
BUG: KASAN: null-ptr-deref in refcount_inc include/linux/refcount.h:241 [inline]
BUG: KASAN: null-ptr-deref in get_task_struct include/linux/sched/task.h:104 [inline]
BUG: KASAN: null-ptr-deref in kthread_stop+0x90/0x7e0 kernel/kthread.c:591
Write of size 4 at addr 000000000000001c by task kworker/u4:5/2519

CPU: 1 PID: 2519 Comm: kworker/u4:5 Not tainted 5.9.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: usbip_event event_handler
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x198/0x1fd lib/dump_stack.c:118
__kasan_report mm/kasan/report.c:517 [inline]
kasan_report.cold+0x5/0x37 mm/kasan/report.c:530
check_memory_region_inline mm/kasan/generic.c:186 [inline]
check_memory_region+0x13d/0x180 mm/kasan/generic.c:192
instrument_atomic_write include/linux/instrumented.h:71 [inline]
atomic_fetch_add_relaxed include/asm-generic/atomic-instrumented.h:142 [inline]
refcount_add include/linux/refcount.h:201 [inline]
refcount_inc include/linux/refcount.h:241 [inline]
get_task_struct include/linux/sched/task.h:104 [inline]
kthread_stop+0x90/0x7e0 kernel/kthread.c:591
vhci_shutdown_connection+0x170/0x2a0 drivers/usb/usbip/vhci_hcd.c:1015
event_handler+0x1a5/0x450 drivers/usb/usbip/usbip_event.c:78
process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
kthread+0x3b5/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
==================================================================
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 2519 Comm: kworker/u4:5 Tainted: G B 5.9.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: usbip_event event_handler
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x198/0x1fd lib/dump_stack.c:118
panic+0x382/0x7fb kernel/panic.c:231
end_report+0x4d/0x53 mm/kasan/report.c:104
__kasan_report mm/kasan/report.c:520 [inline]
kasan_report.cold+0xd/0x37 mm/kasan/report.c:530
check_memory_region_inline mm/kasan/generic.c:186 [inline]
check_memory_region+0x13d/0x180 mm/kasan/generic.c:192
instrument_atomic_write include/linux/instrumented.h:71 [inline]
atomic_fetch_add_relaxed include/asm-generic/atomic-instrumented.h:142 [inline]
refcount_add include/linux/refcount.h:201 [inline]
refcount_inc include/linux/refcount.h:241 [inline]
get_task_struct include/linux/sched/task.h:104 [inline]
kthread_stop+0x90/0x7e0 kernel/kthread.c:591
vhci_shutdown_connection+0x170/0x2a0 drivers/usb/usbip/vhci_hcd.c:1015
event_handler+0x1a5/0x450 drivers/usb/usbip/usbip_event.c:78
process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
kthread+0x3b5/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
Kernel Offset: disabled
Rebooting in 86400 seconds..

Hi Valentina and Shuah,

There appears to be a race condition in the USB/IP vhci_hcd shutdown
procedure. It happens quite often during fuzzing with syzkaller, and
prevents us from going deeper into the USB/IP code.

Could you advise us what would be the best fix for this?


Hi Andrey,

Reading the comments for this routine, it looks like the code assumes
that only one context begins cleanup, so race conditions aren't
considered.

The right fix is to hold vhci->lock and vdev->priv_lock to protect the
critical sections in this routine. I will send a patch for this.
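
To make the locking idea above concrete, here is a rough userspace
sketch of one way such a critical section could look. It is not the
patch and not driver code: struct mock_dev, mock_dev_start() and
mock_dev_shutdown() are hypothetical names, a pthread mutex stands in
for vdev->priv_lock, and a pthread stands in for the rx kthread. The
point it illustrates is that a caller claims the thread under the lock
and stops it only after dropping the lock (kthread_stop() can sleep),
so a second shutdown racing with the first finds nothing left to stop
instead of passing a NULL task along.

```c
/* Userspace model only: pthreads stand in for kthreads and kernel locks. */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-device state; not the real struct vhci_device. */
struct mock_dev {
    pthread_mutex_t lock;   /* plays the role of vdev->priv_lock */
    pthread_t rx_thread;    /* meaningful only while rx_running is true */
    bool rx_running;        /* plays the role of a non-NULL thread pointer */
    bool stop_requested;    /* tells the worker to exit */
};

/* Worker loop: polls the stop flag under the lock until asked to exit. */
static void *mock_rx_loop(void *arg)
{
    struct mock_dev *dev = arg;

    for (;;) {
        pthread_mutex_lock(&dev->lock);
        bool stop = dev->stop_requested;
        pthread_mutex_unlock(&dev->lock);
        if (stop)
            return NULL;
        sched_yield();
    }
}

/* Initialise the device and start its worker. Returns 0 on success. */
int mock_dev_start(struct mock_dev *dev)
{
    int rc;

    pthread_mutex_init(&dev->lock, NULL);
    dev->rx_running = false;
    dev->stop_requested = false;

    rc = pthread_create(&dev->rx_thread, NULL, mock_rx_loop, dev);
    if (rc == 0) {
        pthread_mutex_lock(&dev->lock);
        dev->rx_running = true;
        pthread_mutex_unlock(&dev->lock);
    }
    return rc;
}

/*
 * Shut the worker down. Returns true if this caller stopped it, false
 * if there was nothing to stop (never started, or another caller won).
 */
bool mock_dev_shutdown(struct mock_dev *dev)
{
    pthread_t victim;
    int rc;

    /* Critical section: check and claim the worker atomically. */
    pthread_mutex_lock(&dev->lock);
    if (!dev->rx_running) {
        pthread_mutex_unlock(&dev->lock);
        return false;
    }
    victim = dev->rx_thread;
    dev->rx_running = false;
    dev->stop_requested = true;
    pthread_mutex_unlock(&dev->lock);

    /* Blocking wait happens outside the lock, on our private copy. */
    rc = pthread_join(victim, NULL);
    assert(rc == 0);
    (void)rc;
    return true;
}
```

With this shape, calling mock_dev_shutdown() twice on the same device
stops the worker once and the second call returns false. Whether the
real routine can be structured this way depends on details of the
driver that this sketch does not model.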


Hi Andrey,

I have been unable to reproduce the problem with the reproducer
so far. You mentioned it happens quite often.

- matched config with yours
- load vhci_hcd module and run the reproducer

Hm, if you matched the config, then the module should be built-in?


Right. I did notice that your config has it built in. This shouldn't
matter, but I have a kernel with it built statically. I will try that
to see if it makes a difference.


I do see the messages during shutdown - stop threads etc.

What am I missing?

This appears to be a race that requires precise timings. I failed to
reproduce it with the C reproducer, but I managed to reproduce it with
the syzkaller repro program:

https://syzkaller.appspot.com/x/repro.syz?x=16cbaa7d900000

To do that you need to build syzkaller, and copy ./bin/syz-execprog
and ./bin/syz-executor into your testing environment, and then do:

./syz-execprog -sandbox=none -repeat=0 -procs=6 ./repro.prog


Thanks for the tips on your environment.

thanks,
-- Shuah