Re: [PATCH] perf: Fix deadlock in perf_mmap()

From: Ian Rogers

Date: Tue Mar 24 2026 - 14:39:43 EST


On Mon, Mar 9, 2026 at 9:45 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
>
> On Mon, Mar 9, 2026 at 8:37 PM Qing Wang <wangqing7171@xxxxxxxxx> wrote:
> >
> > On Tue, 10 Mar 2026 at 02:59, Ian Rogers <irogers@xxxxxxxxxx> wrote:
> > > Hi Qing, thank you for looking into this. I proposed a similar fix:
> > > https://lore.kernel.org/lkml/CAP-5=fW-wHEv=TCULwk_HVOhWHdqRd8AZoESZsU_vnhLjghUBQ@xxxxxxxxxxxxxx/
> > > but Haocheng noted it reintroduced the race condition the original fix
> > > was targeting. Haocheng has a larger fix in:
> > > https://lore.kernel.org/lkml/20260306093616.84299-1-yuhaocheng035@xxxxxxxxx/
> > > that should handle both the original race condition and the deadlock.
> >
> > Oh, that fix is quite large.
> >
> > > I wonder if that fix may be made smaller by passing a parameter like
> > > "holds_event_mmap_lock" to perf_mmap_close, something like:
> > > ```
> > > static void __perf_mmap_close(struct vm_area_struct *vma,
> > >                               struct perf_event *event,
> > >                               bool holds_event_mmap_lock)
> > > {
> > >         ... // code from the original perf_mmap_close
> > >         if ((!holds_event_mmap_lock &&
> > >              !refcount_dec_and_mutex_lock(&event->mmap_count,
> > >                                           &event->mmap_mutex)) ||
> > >             (holds_event_mmap_lock &&
> > >              !refcount_dec_and_test(&event->mmap_count)))
> > >         ...
> > > }
> > >
> > > static void perf_mmap_close(struct vm_area_struct *vma)
> > > {
> > >         struct perf_event *event = vma->vm_file->private_data;
> > >
> > >         __perf_mmap_close(vma, event, /*holds_event_mmap_lock=*/false);
> > > }
> > >
> > > static void perf_mmap_close_locked(struct vm_area_struct *vma,
> > >                                    struct perf_event *event)
> > > {
> > >         __perf_mmap_close(vma, event, /*holds_event_mmap_lock=*/true);
> > > }
> > > ```
> > >
> > > Thanks,
> > > Ian
> >
> > LGTM, your fix is better.
> >
> > But can you explain why there is still a race issue after applying my patch?
>
> Hi Qing,
>
> Haocheng has a complex explanation here:
> https://lore.kernel.org/lkml/CAAoXzSoYPvE-AdhT28Rhik2BtFhFuntaM6uhrMqKLUaDgkyiNg@xxxxxxxxxxxxxx/
> with the original cause, which I think stemmed from syzkaller fuzzing.
> I can follow the explanation there, but I'm not familiar enough with
> the ring buffer code to know whether the race with open is benign or
> not. I'm guessing Peter Z thought it wasn't, hence the original patch
> landed.

I think no fix has landed yet and the original patch with the issue
has appeared in backports. Haocheng, did you want to see if you could
incorporate my feedback into your fix?

Thanks,
Ian

> Thanks,
> Ian
>
> > Thanks,
> > Qing