Re: [RT] i915 sleeping function from atomic in gen6_reset_engines()

From: Mike Galbraith
Date: Mon Jul 10 2023 - 06:12:37 EST


On Mon, 2023-07-10 at 09:21 +0200, Juri Lelli wrote:
> Hi!
>
> On 23/01/23 17:04, Juri Lelli wrote:
> > Hi,
> >
> > I've just noticed the following while testing v6.2-rc3-rt1.
>
> I'm still seeing the following on v6.4-rt6.
>
> I believe 20211006164628.s2mtsdd2jdbfyf7g@xxxxxxxxxxxxx should cure it,
> but I don't think it did go anywhere?

The raw lock cyclictest deltas don't look great. I shut it up with the
below. For my i915 equipped lappy, a raw lock fix would be just fine,
its cyclictest numbers already being well south of 'oh dear'.

Subject: drm,i915: Don't disable preemption in __intel_gt_reset() for PREEMPT_RT
From: Mike Galbraith <efault@xxxxxx>
Date: Wed Dec 7 10:13:21 CET 2022

BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 330, name: systemd-udevd
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
Preemption disabled at:
[<ffffffffa090e08e>] __intel_gt_reset+0x6e/0xe0 [i915]
CPU: 0 PID: 330 Comm: systemd-udevd Tainted: G I E 6.1.0.g8ed710d-master-rt #27
Hardware name: HP HP Spectre x360 Convertible/804F, BIOS F.47 11/22/2017
Call Trace:
<TASK>
dump_stack_lvl+0x33/0x46
? __intel_gt_reset+0x6e/0xe0 [i915]
__might_resched+0x162/0x1b0
? __gen11_reset_engines.isra.21+0x2d0/0x2d0 [i915]
rt_spin_lock+0x2d/0x70
gen8_reset_engines+0x33/0x220 [i915]
? __gen11_reset_engines.isra.21+0x2d0/0x2d0 [i915]
__intel_gt_reset+0x79/0xe0 [i915]
sanitize_gpu.part.17+0x2d/0x40 [i915]
i915_driver_probe+0x7b8/0xf20 [i915]
...

Replace the preempt_disable()/enable() with a local_lock()/unlock().

Signed-off-by: Mike Galbraith <efault@xxxxxx>
---
drivers/gpu/drm/i915/gt/intel_reset.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -6,6 +6,7 @@
#include <linux/sched/mm.h>
#include <linux/stop_machine.h>
#include <linux/string_helpers.h>
+#include <linux/local_lock.h>

#include "display/intel_display_reset.h"
#include "display/intel_overlay.h"
@@ -766,6 +767,14 @@ wa_14015076503_end(struct intel_gt *gt,
HECI_H_GS1_ER_PREP, 0);
}

+struct reset_lock {
+ local_lock_t lock;
+};
+
+static DEFINE_PER_CPU(struct reset_lock, reset_lock) = {
+ .lock = INIT_LOCAL_LOCK(lock),
+};
+
int __intel_gt_reset(struct intel_gt *gt, intel_engine_mask_t engine_mask)
{
const int retries = engine_mask == ALL_ENGINES ? RESET_MAX_RETRIES : 1;
@@ -788,9 +797,9 @@ int __intel_gt_reset(struct intel_gt *gt
reset_mask = wa_14015076503_start(gt, engine_mask, !retry);

GT_TRACE(gt, "engine_mask=%x\n", reset_mask);
- preempt_disable();
+ local_lock(&reset_lock.lock);
ret = reset(gt, reset_mask, retry);
- preempt_enable();
+ local_unlock(&reset_lock.lock);

wa_14015076503_end(gt, reset_mask);
}