Re: Deadlock cpuctx_mutex / pmus_lock / &mm->mmap_lock#2
From: Thomas Gleixner
Date: Thu Nov 19 2020 - 10:21:09 EST
On Thu, Nov 19 2020 at 13:25, Chris Wilson wrote:
> Quoting Peter Zijlstra (2020-11-19 13:02:44)
>>
>> Chris, I suspect this is due to i915 calling stop machine with all sorts
>> of locks held. Is there anything to be done about this? stop_machine()
>> is really nasty to begin with.
>>
>> What problem is it trying to solve?
>
> Any concurrent access through a PCI BAR (that is exported to
> userspace via mmap) while the GTT is updated results in undefined HW
> behaviour (where that is not limited to users writing to other system
> pages).
>
> stop_machine() is the most foolproof method we know that works.
It's also the biggest hammer and is going to cause latencies even on
CPUs which are not involved at all. We already have enough trouble
with WBINVD latency-wise, so no need to add yet another way to hurt
everyone.
As the gfx muck knows which processes have stuff mapped, there are
certainly ways to make them and only them rendezvous and do so while
staying preemptible otherwise. It might take a RESCHED_IPI to all CPUs
to achieve that, but that's a cheap operation compared to what you want
to do.
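To illustrate the shape of such a targeted rendezvous, here is a rough
userspace sketch with pthreads. It is an analogy only, not kernel or i915
code: the names (mapper, safepoint, update_gtt, run_rendezvous_demo) and
the constants are hypothetical, and the mappers poll a safe point where
the kernel would presumably use an IPI to force one. Only the threads that
"have the mapping" park; nothing else in the process is stopped.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NMAPPERS 3       /* hypothetical: threads that have the BAR mapped */
#define ACCESSES 20000   /* accesses each mapper performs */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool update_pending;    /* protected by lock */
static int parked;             /* mappers waiting at the safe point */
static int active;             /* mappers that have not exited yet */
static atomic_bool updating;   /* true while the "GTT" is rewritten */
static atomic_int violations;  /* accesses observed during an update */

/* Safe point: park here for as long as an update is pending. */
static void safepoint(void)
{
    pthread_mutex_lock(&lock);
    if (update_pending) {
        parked++;
        pthread_cond_broadcast(&cond);   /* tell the updater we arrived */
        while (update_pending)
            pthread_cond_wait(&cond, &lock);
        parked--;
    }
    pthread_mutex_unlock(&lock);
}

static void *mapper(void *arg)
{
    (void)arg;
    for (int i = 0; i < ACCESSES; i++) {
        safepoint();
        if (atomic_load(&updating))      /* stands in for a BAR access */
            atomic_fetch_add(&violations, 1);
    }
    pthread_mutex_lock(&lock);
    active--;                            /* exited, so never parks again */
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Rendezvous with the mappers only, then do the update. */
static void update_gtt(void)
{
    pthread_mutex_lock(&lock);
    update_pending = true;
    while (parked < active)              /* wait until every mapper parked */
        pthread_cond_wait(&cond, &lock);
    atomic_store(&updating, true);
    /* the real update would go here */
    atomic_store(&updating, false);
    update_pending = false;
    pthread_cond_broadcast(&cond);       /* release the parked mappers */
    pthread_mutex_unlock(&lock);
}

/* Returns the number of accesses that raced with an update.
 * Error handling for pthread_create() is omitted in this sketch. */
int run_rendezvous_demo(int updates)
{
    pthread_t t[NMAPPERS];

    active = NMAPPERS;
    parked = 0;
    atomic_store(&violations, 0);
    for (int i = 0; i < NMAPPERS; i++)
        pthread_create(&t[i], NULL, mapper, NULL);
    for (int i = 0; i < updates; i++)
        update_gtt();
    for (int i = 0; i < NMAPPERS; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&violations);
}
```

Calling run_rendezvous_demo(50) from a main() built with -pthread should
report no racing accesses, since the updater only proceeds once every
still-running mapper is parked.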
Thanks,
tglx