Re: Deadlock cpuctx_mutex / pmus_lock / &mm->mmap_lock#2

From: Peter Zijlstra
Date: Thu Nov 19 2020 - 09:19:32 EST


On Thu, Nov 19, 2020 at 01:25:11PM +0000, Chris Wilson wrote:
> Quoting Peter Zijlstra (2020-11-19 13:02:44)
> >
> > Chris, I suspect this is due to i915 calling stop machine with all sorts
> > of locks held. Is there anything to be done about this? stop_machine()
> > is really nasty to begin with.
> >
> > What problem is it trying to solve?
>
> Any concurrent access through a PCI BAR (that is exported to
> userspace via mmap) while the GTT is being updated results in
> undefined HW behaviour (which is not limited to users writing to
> other system pages).
>
> stop_machine() is the most foolproof method we know that works.

Sorry, I don't understand. It tries to do what? And why does it need to
do that while holding locks?

Really, this is very bad form.

> This particular cycle is easy to break by moving the copy_to_user to
> after releasing perf_event_ctx_unlock in perf_read().

The splat in question is about the ioctl()s, but yeah that too. Not sure
how easy that is. I'm also not sure that'll solve your problem:
cpu_hotplug_lock is a big lock, and there's tons of stuff inside.
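
For illustration only, the reordering suggested in the quoted text
(copy to userspace after dropping the context lock) might look roughly
like the userspace analogue below. This is a sketch, not the kernel's
perf_read(): struct demo_ctx, demo_copy_out() and demo_read() are
hypothetical names, a pthread mutex stands in for the perf context
mutex, and memcpy() stands in for copy_to_user(). It only shows the
snapshot-then-copy pattern for the read path; as noted above, it does
not address the ioctl() path or the wider cpu_hotplug_lock dependency.

```c
#include <pthread.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a perf event context; not the kernel struct. */
struct demo_ctx {
    pthread_mutex_t lock;    /* plays the role of the context mutex */
    uint64_t count;          /* values a reader wants to see */
    uint64_t time_enabled;
};

/*
 * Stand-in for copy_to_user(). In the kernel the real call may fault and
 * take mmap_lock, which is why it should not run under the context lock.
 */
static int demo_copy_out(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    return 0;
}

/* Snapshot under the lock, then copy out only after the lock is dropped. */
static long demo_read(struct demo_ctx *ctx, void *ubuf, size_t len)
{
    uint64_t values[2];

    if (len < sizeof(values))
        return -1;

    pthread_mutex_lock(&ctx->lock);
    values[0] = ctx->count;          /* gather into a local buffer */
    values[1] = ctx->time_enabled;
    pthread_mutex_unlock(&ctx->lock);

    /* The potentially faulting copy now happens with no lock held. */
    if (demo_copy_out(ubuf, values, sizeof(values)))
        return -1;

    return (long)sizeof(values);
}
```

The cost of this pattern is a temporary buffer sized for the largest
read; in exchange, the lock is never held across an operation that can
take mmap_lock.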