Re: [PATCH v3 6/7] drm/msm/dp: add pm_runtime_force_suspend()/resume()

From: Stephen Boyd
Date: Mon Oct 02 2023 - 18:58:13 EST


Quoting Abhinav Kumar (2023-09-28 17:46:11)
> On 9/27/2023 3:01 PM, Stephen Boyd wrote:
> > Quoting Kuogee Hsieh (2023-09-25 09:07:18)
> >>
> >> However for external DP case, link training can not be guarantee always
> >> success without link rate or lane being reduced as Abhinav mentioned.
> >>
> >> In addition,  CTS (compliance test) it required to complete link
> >> training within 10ms after hpd asserted.
> >
> > Is it possible to change that timeout? I have to look around for the CTS
> > parameters because I'm pretty confused how it can work. What do we do if
> > DP wakes the system from suspend and asserts HPD? We need resume time to
> > be < 10ms? That's not realistic.
> >
>
> No, the CTS doesnt say we need to finish link training within 10ms after
> HPD is asserted. It says it must be completed in 10ms after
> TRAINING_PATTERN_SET dpcd write.
>
> "Wait until the Source DUT writes 00h to the TRAINING_PATTERN_SET byte
> of Reference Sink DPCD Link Configuration Field to indicate the end of
> the link training. Stop the link training timer. Verify that link
> training completed in 10ms or less"
>
> That needs to be done independent of HPD so we can ignore the CTS point.

Great!

>
> >>
> >> I am not sure do link training at atomic_enable() can meet this timing
> >> requirement.

Why? It's putting some time bound on link training in general to only
take 10ms, right?

> >>
> >
> > At least in the DP spec itself it doesn't require the link to be trained
> > within 10ms of HPD being asserted. Instead it simply recommends that the
> > OS start configuring the display promptly after HPD is asserted, e.g.
> > within 100ms. There's some strict timing on IRQ_HPD, so the driver must
> > read DPCD registers within 100ms of IRQ_HPD rising edge; maybe that is
> > what CTS is checking for?
> >
> > TL;DR: I don't see why CTS should stop us from link training in
> > atomic_enable(). It would be beneficial to do so to make eDP and DP the
> > same. It would also help to report a drm connector being connected
> > _before_ link training so that userspace knows the link itself is the
> > bad part of the equation (and not that the DP connector looks
> > disconnected to userspace when in fact it really is connected and the
> > monitor is asserting HPD, just the link training failed).
>
> Its the corrective action of the userspace when it finds link is bad is
> the concern as I highlighted in the other response. Just reading and
> resetting link_status is not enough to recover.

What needs to be done to recover? Userspace will try to set a mode on
the connector again if the link status is bad and there were some modes
available. If there are zero modes and the link is bad, then it ignores
the connector. I'm not sure what else could be done to recover besides
try again and stop trying if no modes exist.

Acting like the connector isn't connected makes the situation worse for
ChromeOS because userspace thinks there's nothing there so it can't try
to retrain the link again. Instead, userspace has to rely on the kernel
driver to train the link again. The kernel should just tell userspace
the link is bad so userspace can implement the policy to either ignore
the connector entirely or to consider it a display that is having link
training problems.

So again, I see no reason why the kernel driver thinks it can implement
a policy to train the link before indicating the drm connector is
connected. It should stop doing that. Instead it should tell userspace
that the connector is connected and then train the link when there's a
modeset. If the modeset fails then userspace can take action to either
figure out that the link is bad, or notify the user that the cable is
bad, or to try replugging or power cycle the monitor, etc. None of that
can be done if the kernel lies about the state of the connector because
the link training failed.