Re: [PATCH v1] drm/msm/dp: use dp_hpd_plug_handle() and dp_hpd_unplug_handle() directly

From: Abhinav Kumar
Date: Fri Apr 05 2024 - 20:04:34 EST




On 3/28/2024 10:47 PM, Abhinav Kumar wrote:


On 3/28/2024 8:23 PM, Dmitry Baryshkov wrote:
On Fri, 29 Mar 2024 at 04:16, Abhinav Kumar <quic_abhinavk@xxxxxxxxxxx> wrote:



On 3/28/2024 5:10 PM, Dmitry Baryshkov wrote:
On Fri, 29 Mar 2024 at 01:42, Abhinav Kumar <quic_abhinavk@xxxxxxxxxxx> wrote:



On 3/28/2024 3:50 PM, Dmitry Baryshkov wrote:
On Thu, 28 Mar 2024 at 23:21, Abhinav Kumar <quic_abhinavk@xxxxxxxxxxx> wrote:



On 3/28/2024 1:58 PM, Stephen Boyd wrote:
Quoting Abhinav Kumar (2024-03-28 13:24:34)
+ Johan and Bjorn for FYI

On 3/28/2024 1:04 PM, Kuogee Hsieh wrote:
For internal HPD case, hpd_event_thread is created to handle HPD
interrupts generated by HPD block of DP controller. It converts
HPD interrupts into events and executed them under hpd_event_thread
context. For external HPD case, HPD events is delivered by way of
dp_bridge_hpd_notify() under thread context. Since they are executed
under thread context already, there is no reason to hand over those
events to hpd_event_thread. Hence dp_hpd_plug_handle() and
dp_hpd_unplug_hanlde() are called directly at dp_bridge_hpd_notify().

Signed-off-by: Kuogee Hsieh <quic_khsieh@xxxxxxxxxxx>
---
      drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
      1 file changed, 3 insertions(+), 2 deletions(-)


Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")

Is this a bug fix or an optimization? The commit text doesn't tell me.


I would say both.

optimization as it avoids the need to go through the hpd_event thread
processing.

bug fix because once you go through the hpd event thread processing it
exposes and often breaks the already fragile hpd handling state machine
which can be avoided in this case.

Please add a description for the particular issue that was observed
and how it is fixed by the patch.

Otherwise consider there to be an implicit NAK for all HPD-related
patches unless it is a series that moves link training to the enable
path and drops the HPD state machine completely.

I really mean it. We should stop beating a dead horse unless there is
a grave bug that must be fixed.


I think the commit message is explaining the issue well enough.

This was not fixing any issue we saw to explain you the exact scenario
of things which happened but this is just from code walkthrough.

Like kuogee wrote, hpd event thread was there so handle events coming
out of the hpd_isr for internal hpd cases. For the hpd_notify coming
from pmic_glink or any other extnernal hpd cases, there is no need to
put this through the hpd event thread because this will only make things
worse of exposing the race conditions of the state machine.

Moving link training to enable and removal of hpd event thread will be
worked on but delaying obvious things we can fix does not make sense.

  From the commit message this feels like an optimisation rather than a
fix. And granted the fragility of the HPD state machine, I'd prefer to
stay away from optimisations. As far as I understood from the history
of the last revert, we'd better make sure that HPD handling goes only
through the HPD event thread.


I think you are mixing the two. We tried to send the events through
DRM's hpd_notify which ended up in a bad way and btw, thats still not
resolved even though I have seen reports that things are fine with the
revert, we are consistently able to see us ending up in a disconnected
state with all the reverts and fixes in our x1e80100 DP setup.

I plan to investigate that issue properly in the next week and try to
make some sense of it all.

In fact, this patch is removing one more user of the hpd event thread
which is the direction in which we all want to head towards.

As I stated earlier, from my point of view it doesn't make sense to
rework the HPD thread in small steps.

On whether this is an optimization or a bug fix. I think by avoiding hpd
event thread (which should have never been used for hpd_notify updates,
hence a bug) we are avoiding the possibility of more race conditions.

I think that the HPD event thread serializes handling of events, so
avoiding it increases the possibility of a race condition.


So, this has my R-b and it holds. Upto you.

I'd wait for a proper description of the issue that was observed and
how it is solved by this patch.


This was a code walkthrough fix as I wrote a few times. If there no merit in pushing this, lets ignore it and stop discussing.


Ok, so after we debugged the HPD issue on we have found the issue and why actually this change will help. I am going to post a V2 with more details on the commit text. We can discuss after that.