Re: [PATCH v1] drm/msm/dp: use dp_hpd_plug_handle() and dp_hpd_unplug_handle() directly

From: Abhinav Kumar
Date: Thu Mar 28 2024 - 22:16:30 EST




On 3/28/2024 5:10 PM, Dmitry Baryshkov wrote:
On Fri, 29 Mar 2024 at 01:42, Abhinav Kumar <quic_abhinavk@xxxxxxxxxxx> wrote:



On 3/28/2024 3:50 PM, Dmitry Baryshkov wrote:
On Thu, 28 Mar 2024 at 23:21, Abhinav Kumar <quic_abhinavk@xxxxxxxxxxx> wrote:



On 3/28/2024 1:58 PM, Stephen Boyd wrote:
Quoting Abhinav Kumar (2024-03-28 13:24:34)
+ Johan and Bjorn for FYI

On 3/28/2024 1:04 PM, Kuogee Hsieh wrote:
For internal HPD case, hpd_event_thread is created to handle HPD
interrupts generated by HPD block of DP controller. It converts
HPD interrupts into events and executes them under hpd_event_thread
context. For external HPD case, HPD events are delivered by way of
dp_bridge_hpd_notify() under thread context. Since they are executed
under thread context already, there is no reason to hand over those
events to hpd_event_thread. Hence dp_hpd_plug_handle() and
dp_hpd_unplug_handle() are called directly from dp_bridge_hpd_notify().

Signed-off-by: Kuogee Hsieh <quic_khsieh@xxxxxxxxxxx>
---
drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
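
As a rough illustration of the difference the commit message describes, the sketch below contrasts queueing an event for a worker with calling the handler directly from the notifier. It is a simplified user-space model, not the patch and not the driver's code: every name in it (model_dp, notify_deferred(), notify_direct(), worker_drain()) is hypothetical, and the worker is simulated by an explicit drain call rather than a real thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; none of these names come from the msm DP driver. */
#define QUEUE_LEN 8

enum hpd_event { EV_PLUG, EV_UNPLUG };

struct model_dp {
	bool connected;                   /* state updated by the handlers */
	enum hpd_event queue[QUEUE_LEN];  /* stand-in for the event thread's queue */
	size_t head, tail;                /* consumer and producer counters */
};

static void handle_plug(struct model_dp *dp)   { dp->connected = true; }
static void handle_unplug(struct model_dp *dp) { dp->connected = false; }

/* Deferred path: only record the event; the handler runs later in the worker. */
static bool notify_deferred(struct model_dp *dp, bool plugged)
{
	if (dp->tail - dp->head == QUEUE_LEN)
		return false;             /* queue full, event dropped */
	dp->queue[dp->tail++ % QUEUE_LEN] = plugged ? EV_PLUG : EV_UNPLUG;
	return true;
}

/* Stand-in for one pass of the event thread: handle everything queued so far. */
static void worker_drain(struct model_dp *dp)
{
	while (dp->head != dp->tail) {
		enum hpd_event ev = dp->queue[dp->head++ % QUEUE_LEN];

		if (ev == EV_PLUG)
			handle_plug(dp);
		else
			handle_unplug(dp);
	}
}

/* Direct path: the notifier already runs in thread context, so call the handler now. */
static void notify_direct(struct model_dp *dp, bool plugged)
{
	if (plugged)
		handle_plug(dp);
	else
		handle_unplug(dp);
}
```

In the deferred variant the state changes only once the worker has run; in the direct variant it has changed by the time the notifier returns.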


Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")

Is this a bug fix or an optimization? The commit text doesn't tell me.


I would say both.

optimization as it avoids the need to go through the hpd_event thread
processing.

bug fix because once you go through the hpd event thread processing it
exposes and often breaks the already fragile hpd handling state machine
which can be avoided in this case.

Please add a description for the particular issue that was observed
and how it is fixed by the patch.

Otherwise consider there to be an implicit NAK for all HPD-related
patches unless it is a series that moves link training to the enable
path and drops the HPD state machine completely.

I really mean it. We should stop beating a dead horse unless there is
a grave bug that must be fixed.


I think the commit message is explaining the issue well enough.

This is not fixing an issue we observed, so I cannot give you the exact
scenario of what happened; it comes from a code walkthrough.

Like Kuogee wrote, the hpd event thread was there to handle events coming
out of the hpd_isr for internal hpd cases. For the hpd_notify coming
from pmic_glink or any other external hpd cases, there is no need to
put this through the hpd event thread because this will only make things
worse by exposing the race conditions of the state machine.

Moving link training to enable and removal of hpd event thread will be
worked on but delaying obvious things we can fix does not make sense.

From the commit message this feels like an optimisation rather than a
fix. And given the fragility of the HPD state machine, I'd prefer to
stay away from optimisations. As far as I understood from the history
of the last revert, we'd better make sure that HPD handling goes only
through the HPD event thread.


I think you are mixing the two. We tried to send the events through DRM's hpd_notify, which ended badly, and btw, that's still not resolved. Even though I have seen reports that things are fine with the revert, we consistently end up in a disconnected state with all the reverts and fixes in our x1e80100 DP setup.

I plan to investigate that issue properly in the next week and try to make some sense of it all.

In fact, this patch is removing one more user of the hpd event thread, which is the direction in which we all want to head.

On whether this is an optimization or a bug fix: I think by avoiding the hpd event thread (which should never have been used for hpd_notify updates, hence a bug) we are avoiding the possibility of more race conditions.

So, this has my R-b and it holds. Up to you.