[PATCH v3 5/5] drm/msm/dpu: rate limit snapshot capture for mmu faults

From: Jessica Zhang
Date: Wed Feb 19 2025 - 14:50:18 EST


From: Abhinav Kumar <quic_abhinavk@xxxxxxxxxxx>

There is no recovery mechanism in place yet to recover from mmu
faults for DPU. We can only prevent the faults by making sure there
is no misconfiguration.

Rate-limit the snapshot capture for mmu faults to once per
msm_atomic_commit_tail() as that should be sufficient to capture
the snapshot for debugging otherwise there will be a lot of DPU
snapshots getting captured for the same fault which is redundant
and also might affect capturing even one snapshot accurately.

Signed-off-by: Abhinav Kumar <quic_abhinavk@xxxxxxxxxxx>
Signed-off-by: Jessica Zhang <quic_jesszhan@xxxxxxxxxxx>
---
Changes in v3:
- Clear fault_snapshot_capture before calling prepare_commit() (Dmitry)
- Make fault_snapshot_capture an atomic variable (Dmitry, Abhinav)
---
drivers/gpu/drm/msm/msm_atomic.c | 2 ++
drivers/gpu/drm/msm/msm_kms.c | 5 ++++-
drivers/gpu/drm/msm/msm_kms.h | 3 +++
3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_atomic.c b/drivers/gpu/drm/msm/msm_atomic.c
index 9c45d641b521..44b402e70925 100644
--- a/drivers/gpu/drm/msm/msm_atomic.c
+++ b/drivers/gpu/drm/msm/msm_atomic.c
@@ -221,6 +221,8 @@ void msm_atomic_commit_tail(struct drm_atomic_state *state)
kms->funcs->wait_flush(kms, crtc_mask);
trace_msm_atomic_wait_flush_finish(crtc_mask);

+ atomic_set(&kms->fault_snapshot_capture, 0);
+
/*
* Now that there is no in-progress flush, prepare the
* current update:
diff --git a/drivers/gpu/drm/msm/msm_kms.c b/drivers/gpu/drm/msm/msm_kms.c
index 1d3dae3d4c93..171462a8e08c 100644
--- a/drivers/gpu/drm/msm/msm_kms.c
+++ b/drivers/gpu/drm/msm/msm_kms.c
@@ -168,7 +168,10 @@ static int msm_kms_fault_handler(void *arg, unsigned long iova, int flags, void
{
struct msm_kms *kms = arg;

- msm_disp_snapshot_state(kms->dev);
+ if (atomic_read(&kms->fault_snapshot_capture) == 0) {
+ msm_disp_snapshot_state(kms->dev);
+ atomic_inc(&kms->fault_snapshot_capture);
+ }

return -ENOSYS;
}
diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
index e60162744c66..3e28c4e012d2 100644
--- a/drivers/gpu/drm/msm/msm_kms.h
+++ b/drivers/gpu/drm/msm/msm_kms.h
@@ -128,6 +128,9 @@ struct msm_kms {
int irq;
bool irq_requested;

+ /* rate limit the snapshot capture to once per attach */
+ atomic_t fault_snapshot_capture;
+
/* mapper-id used to request GEM buffer mapped for scanout: */
struct msm_gem_address_space *aspace;


--
2.48.1