Re: [PATCH v2] perf/core: Fix refcount bug and potential UAF in perf_mmap

From: Haocheng Yu

Date: Sat Mar 07 2026 - 00:57:54 EST


But if it is moved out, another event could acquire the mutex after the
current event has executed refcount_dec but before it releases rb, which
reintroduces the race condition I described in the patch.

The following is a more detailed analysis:

In the C reproducer, four system calls are central to this problem. The third
one passes r0 as the group argument, establishing a shared ring buffer, and
the fourth uses an unusual flag combination, which is likely what allows the
bug to be triggered.


res = syscall(__NR_perf_event_open, /*attr=*/0x200000000000ul, /*pid=*/0,
              /*cpu=*/1ul, /*group=*/(intptr_t)-1,
              /*flags=PERF_FLAG_FD_CLOEXEC*/ 8ul);
syscall(__NR_mmap, /*addr=*/0x200000002000ul, /*len=*/0x1000ul, /*prot=*/0ul,
        /*flags=MAP_FIXED|MAP_SHARED*/ 0x11ul, /*fd=*/r[0], /*offset=*/0ul);

res = syscall(__NR_perf_event_open, /*attr=*/0x200000000000ul, /*pid=*/0,
              /*cpu=*/1ul, /*group=*/r[0], /*flags=PERF_FLAG_FD_OUTPUT*/ 2ul);
syscall(__NR_mmap, /*addr=*/0x200000186000ul, /*len=*/0x1000ul,
        /*prot=PROT_GROWSDOWN|PROT_SEM|PROT_WRITE|PROT_READ*/ 0x100000bul,
        /*flags=MAP_SHARED_VALIDATE|MAP_FIXED*/ 0x13ul, /*fd=*/r[1],
        /*offset=*/0ul);


Here is what happens: r0 enters perf_mmap first. It acquires the mutex,
executes perf_mmap_rb, releases the mutex, and then calls map_range. If
map_range fails, the function falls into perf_mmap_close, which drops the
refcount to 0 and finally frees the rb.

In exactly that window (after the refcount is decremented but before rb is
freed), r1 also enters perf_mmap and attempts to attach to r0's ring buffer.
Because the mutex has already been released, the second mmap can acquire it,
observe the still-set rb pointer shared with r0 before it is cleared, and
attempt to increment the refcount on a buffer that is already being
destroyed.


> On Fri, Mar 6, 2026 at 1:37 AM Haocheng Yu <yuhaocheng035@xxxxxxxxx> wrote:
> >
> > That makes a lot of sense. It's indeed possible for a self deadlock to occur.
> >
> > I tried updating my patch by deriving a `perf_mmap_close_locked`
> > variant of `perf_mmap_close` that handles the case where
> > event->mmap_mutex is already held on entry. But this approach isn't
> > very concise, and I'm not sure whether I changed the original logic
> > in some unexpected way. On the other hand, releasing the mutex before
> > perf_mmap_close finishes might reintroduce the original race
> > condition, which puts me in a dilemma.
> >
> > Do you have any suggestions?
>
> With the:
> ```
> + if (ret)
> + perf_mmap_close_locked(vma);
> ```
> Wouldn't moving it outside the "scoped_guard(mutex,
> &event->mmap_mutex)" be a fix?
>
> Thanks,
> Ian
>
> > Thanks,
> > Haocheng
> >
> >
> >
> > > On Mon, Feb 2, 2026 at 8:30 AM <yuhaocheng035@xxxxxxxxx> wrote:
> > > >
> > > > From: Haocheng Yu <yuhaocheng035@xxxxxxxxx>
> > > >
> > > > Syzkaller reported a refcount_t: addition on 0; use-after-free warning
> > > > in perf_mmap.
> > > >
> > > > The issue is caused by a race condition between a failing mmap() setup
> > > > and a concurrent mmap() on a dependent event (e.g., using output
> > > > redirection).
> > > >
> > > > In perf_mmap(), the ring_buffer (rb) is allocated and assigned to
> > > > event->rb with the mmap_mutex held. The mutex is then released to
> > > > perform map_range().
> > > >
> > > > If map_range() fails, perf_mmap_close() is called to clean up.
> > > > However, since the mutex was dropped, another thread attaching to
> > > > this event (via inherited events or output redirection) can acquire
> > > > the mutex, observe the valid event->rb pointer, and attempt to
> > > > increment its reference count. If the cleanup path has already
> > > > dropped the reference count to zero, this results in a
> > > > use-after-free or refcount saturation warning.
> > > >
> > > > Fix this by extending the scope of mmap_mutex to cover the
> > > > map_range() call. This ensures that ring buffer initialization
> > > > and mapping (or cleanup on failure) happen atomically,
> > > > effectively preventing other threads from accessing a
> > > > half-initialized or dying ring buffer.
> > >
> > > As perf_mmap_close is now called inside the guarded region, is there
> > > potential for self deadlock?
> > >
> > > In perf_mmap it is now calling perf_mmap_close holding the event->mmap_mutex:
> > > ```
> > > scoped_guard (mutex, &event->mmap_mutex) {
> > > [...]
> > > ret = map_range(event->rb, vma);
> > > if (ret)
> > > perf_mmap_close(vma);
> > > }
> > > ```
> > > and in perf_mmap_close the mutex will be taken again:
> > > ```
> > > static void perf_mmap_close(struct vm_area_struct *vma)
> > > {
> > > struct perf_event *event = vma->vm_file->private_data;
> > > [...]
> > > if (!refcount_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex))
> > > goto out_put;
> > > ```
> > >
> > > Thanks,
> > > Ian
> > >
> > > > Reported-by: kernel test robot <lkp@xxxxxxxxx>
> > > > Closes: https://lore.kernel.org/oe-kbuild-all/202602020208.m7KIjdzW-lkp@xxxxxxxxx/
> > > > Signed-off-by: Haocheng Yu <yuhaocheng035@xxxxxxxxx>
> > > > ---
> > > > kernel/events/core.c | 38 +++++++++++++++++++-------------------
> > > > 1 file changed, 19 insertions(+), 19 deletions(-)
> > > >
> > > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > > index 2c35acc2722b..abefd1213582 100644
> > > > --- a/kernel/events/core.c
> > > > +++ b/kernel/events/core.c
> > > > @@ -7167,28 +7167,28 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
> > > > ret = perf_mmap_aux(vma, event, nr_pages);
> > > > if (ret)
> > > > return ret;
> > > > - }
> > > >
> > > > - /*
> > > > - * Since pinned accounting is per vm we cannot allow fork() to copy our
> > > > - * vma.
> > > > - */
> > > > - vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP);
> > > > - vma->vm_ops = &perf_mmap_vmops;
> > > > + /*
> > > > + * Since pinned accounting is per vm we cannot allow fork() to copy our
> > > > + * vma.
> > > > + */
> > > > + vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP);
> > > > + vma->vm_ops = &perf_mmap_vmops;
> > > >
> > > > - mapped = get_mapped(event, event_mapped);
> > > > - if (mapped)
> > > > - mapped(event, vma->vm_mm);
> > > > + mapped = get_mapped(event, event_mapped);
> > > > + if (mapped)
> > > > + mapped(event, vma->vm_mm);
> > > >
> > > > - /*
> > > > - * Try to map it into the page table. On fail, invoke
> > > > - * perf_mmap_close() to undo the above, as the callsite expects
> > > > - * full cleanup in this case and therefore does not invoke
> > > > - * vmops::close().
> > > > - */
> > > > - ret = map_range(event->rb, vma);
> > > > - if (ret)
> > > > - perf_mmap_close(vma);
> > > > + /*
> > > > + * Try to map it into the page table. On fail, invoke
> > > > + * perf_mmap_close() to undo the above, as the callsite expects
> > > > + * full cleanup in this case and therefore does not invoke
> > > > + * vmops::close().
> > > > + */
> > > > + ret = map_range(event->rb, vma);
> > > > + if (ret)
> > > > + perf_mmap_close(vma);
> > > > + }
> > > >
> > > > return ret;
> > > > }
> > > >
> > > > base-commit: 7d0a66e4bb9081d75c82ec4957c50034cb0ea449
> > > > --
> > > > 2.51.0
> > > >
> > > >