Re: [Intel-gfx] [PATCH 1/2] drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock

From: Tvrtko Ursulin
Date: Fri Oct 06 2017 - 10:44:54 EST



On 06/10/2017 15:23, Daniel Vetter wrote:
On Fri, Oct 06, 2017 at 12:34:02PM +0100, Tvrtko Ursulin wrote:

On 06/10/2017 10:06, Daniel Vetter wrote:
4.14-rc1 gained the fancy new cross-release support in lockdep, which
seems to have uncovered a few more rules about what is allowed and
isn't.

This one here seems to indicate that allocating a work-queue while
holding mmap_sem is a no-go, so let's try to preallocate it.

Of course another way to break this chain would be somewhere in the
cpu hotplug code, since this isn't the only trace we're finding now
which goes through msr_create_device.

Full lockdep splat:

[snipped lockdep splat]

v2: Set ret correctly when we raced with another thread.

v3: Use Chris' diff. Attach the right lockdep splat.

Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Sasha Levin <alexander.levin@xxxxxxxxxxx>
Cc: Marta Lofstedt <marta.lofstedt@xxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
References: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3180/shard-hsw3/igt@prime_mmap@test_userptr.html
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102939
Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
---
drivers/gpu/drm/i915/i915_gem_userptr.c | 35 +++++++++++++++++++--------------
1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 2d4996de7331..f9b3406401af 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -164,7 +164,6 @@ static struct i915_mmu_notifier *
i915_mmu_notifier_create(struct mm_struct *mm)
{
struct i915_mmu_notifier *mn;
- int ret;
mn = kmalloc(sizeof(*mn), GFP_KERNEL);
if (mn == NULL)
@@ -179,14 +178,6 @@ i915_mmu_notifier_create(struct mm_struct *mm)
return ERR_PTR(-ENOMEM);
}
- /* Protected by mmap_sem (write-lock) */
- ret = __mmu_notifier_register(&mn->mn, mm);
- if (ret) {
- destroy_workqueue(mn->wq);
- kfree(mn);
- return ERR_PTR(ret);
- }
-
return mn;
}
@@ -210,23 +201,37 @@ i915_gem_userptr_release__mmu_notifier(struct drm_i915_gem_object *obj)
static struct i915_mmu_notifier *
i915_mmu_notifier_find(struct i915_mm_struct *mm)
{
- struct i915_mmu_notifier *mn = mm->mn;
+ struct i915_mmu_notifier *mn;
+ int err;
mn = mm->mn;
if (mn)
return mn;
+ mn = i915_mmu_notifier_create(mm->mm);
+ if (IS_ERR(mn))
+ return mn;

Strictly speaking we don't want to fail just yet, only it we actually needed
a new notifier and we failed to create it.

The check 2 lines above not good enough? It's somewhat racy, but I'm not
sure what value we provide by being perfectly correct against low memory.
This thread racing against a 2nd one, where the minimal allocation of the
2nd one pushed us perfectly over the oom threshold seems a very unlikely
scenario.

Also, small allocations actually never fail :-)

Yes, but, we otherwise make each other re-spin for much smaller things than bailout logic being conceptually at the wrong place. So for me I'd like a respin. It's not complicated at all, just move the bailout to to before the __mmu_notifier_register:

...

err = 0;
if (IS_ERR(mn))
err = PTR_ERR(..);

...

if (mana->manah == NULL) { /* ;-D */
/* Protect by mmap_sem...
if (err == 0) {
err = __mmu_notifier_register(..);
...
}
}

...

if (mn && !IS_ERR(mn)) {
...free...
}

I think.. ?

R-b on this, plus below, unless I got something wrong.



+
+ err = 0;
down_write(&mm->mm->mmap_sem);
mutex_lock(&mm->i915->mm_lock);
- if ((mn = mm->mn) == NULL) {ed
- mn = i915_mmu_notifier_create(mm->mm);
- if (!IS_ERR(mn))
- mm->mn = mn;
+ if (mm->mn == NULL) {
+ /* Protected by mmap_sem (write-lock) */
+ err = __mmu_notifier_register(&mn->mn, mm->mm);
+ if (!err) {
+ /* Protected by mm_lock */
+ mm->mn = fetch_and_zero(&mn);
+ }
}
mutex_unlock(&mm->i915->mm_lock);
up_write(&mm->mm->mmap_sem);
- return mn;
+ if (mn) {
+ destroy_workqueue(mn->wq);
+ kfree(mn);
+ }
+
+ return err ? ERR_PTR(err) : mm->mn;
}
static int


Otherwise looks good to me.

I would also put a note in the commit on how working around the locking
issue is also beneficial to performance with moving the allocation step
outside the mmap_sem.

Yeah Chris brought that up too, I don't really buy it given how
heavy-weight __mmu_notifier_register is. But I can add something like:

"This also has the minor benefit of slightly reducing the critical
section where we hold mmap_sem."

r-b with that added to the commit message?

I think for me it is more making it clear that we are working around something where we are not strictly broken by design with a, however meaningless, optimisation at the same time.

Regards,

Tvrtko