Re: [syzbot] [perf?] KCSAN: data-race in _free_event / perf_pending_task (2)

From: Dmitry Vyukov
Date: Mon Oct 14 2024 - 06:31:00 EST


On Mon, 14 Oct 2024 at 11:41, Marco Elver <elver@xxxxxxxxxx> wrote:
>
> On Mon, 14 Oct 2024 at 10:30, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> >
> > On Mon, 14 Oct 2024 at 08:09, syzbot
> > <syzbot+e75157f5b04f8ff40e17@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit: 87d6aab2389e Merge tag 'for_linus' of git://git.kernel.org..
> > > git tree: upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=10104f9f980000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=a2f7ae2f221e9eae
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=e75157f5b04f8ff40e17
> > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > >
> > > Unfortunately, I don't have any reproducer for this issue yet.
> > >
> > > Downloadable assets:
> > > disk image: https://storage.googleapis.com/syzbot-assets/cce40536bdc3/disk-87d6aab2.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/479edc06c8d8/vmlinux-87d6aab2.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/9d377c65ffca/bzImage-87d6aab2.xz
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+e75157f5b04f8ff40e17@xxxxxxxxxxxxxxxxxxxxxxxxx
> > >
> > > ==================================================================
> > > BUG: KCSAN: data-race in _free_event / perf_pending_task
> > >
> > > write to 0xffff8881155361e8 of 4 bytes by task 9574 on cpu 1:
> > > perf_pending_task+0xe8/0x220 kernel/events/core.c:6976
> > > task_work_run+0x13a/0x1a0 kernel/task_work.c:228
> > > resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
> > > exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
> > > exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
> > > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
> > > syscall_exit_to_user_mode+0xbe/0x130 kernel/entry/common.c:218
> > > do_syscall_64+0xd6/0x1c0 arch/x86/entry/common.c:89
> > > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > >
> > > read to 0xffff8881155361e8 of 4 bytes by task 9573 on cpu 0:
> > > perf_pending_task_sync kernel/events/core.c:5302 [inline]
> >
> > +Marco, IIRC we assumed event->pending_work should only be accessed by
> > the owner task.
> > Here it's concurrently modified/changed. The other task calls
> > task_work_cancel on current based on the valus of this field, this
> > looks bad.
>
> On freeing an event it can be any other task, AFAIK. There's this comment:
>
> > * All accesses related to the event are within the same RCU section in
> > * perf_pending_task(). The RCU grace period before the event is freed
> > * will make sure all those accesses are complete by then.
>
> So there is no risk of UAF.
>
> All that may happen is that on concurrent free of an event with a
> pending SIGTRAP, it's possible the SIGTRAP may or may not be
> delivered. But I think that's perfectly reasonable if the event is in
> the process of being closed.
>
> Did I miss something?

I have not recap all logic, but what looked suspicious on the first glance:

The task doing _free_event->perf_pending_task_sync is not the owner of
the event (the other task is the owner?).

But that task is later using 'current' to do something with regard to
this event:

/*
* If the task is queued to the current task's queue, we
* obviously can't wait for it to complete. Simply cancel it.
*/
if (task_work_cancel(current, head)) {

Is this current wrong here? So it may both not cancel it for the real
owner, and cancel something else for itself (?).