Re: [bug report] 7.0-rc1 flip_done timed out: amd igpu off when resuming in laptop (regression)

From: Alex Deucher

Date: Wed Mar 04 2026 - 11:38:39 EST


+ Harry, Leo, Bhuvana

On Wed, Mar 4, 2026 at 8:42 AM Rafael Passos <rafael@xxxxxxxxxxx> wrote:
>
> I found the issue, but I'm still not sure how to proceed.
> I would like some guidance in fixing this regression.
>
> The issue is the where a Register is being read from.
> Before this change, the MICROSECOND_TIME_BASE_DIV reg wa read from
> dce_hwseq_registers (dce_hwseq.h) and now from dccg_registers (dcn20_dccg.h)
>
> The bisection lead me to this commit: 4c595e75110ece20af3a68c1ebef8ed4c1b69afe
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4c595e75110ece20af3a68c1ebef8ed4c1b69afe
>
> After lot of debugging, I traced the issue to this file:
> drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c?id=4c595e75110ece20af3a68c1ebef8ed4c1b69afe
>
> This card is dcn21, but it uses most of the dcn20 implementation.
> For easy comparison, the following block contains the function with the original path
> commented out (from dcn21), and the function it calls from dcn20:
>
> ```
> bool dcn21_s0i3_golden_init_wa(struct dc *dc)
> {
> if (dc->res_pool->dccg && dc->res_pool->dccg->funcs && dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done){
>
> printk(KERN_CRIT "AUYER in %s", __func__);
> return !dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done(dc->res_pool->dccg);
> }
>
> printk(KERN_CRIT "AUYER in %s", __func__);
>
> return false;
>
> // original flow:
> // struct dce_hwseq *hws = dc->hwseq;
> // uint32_t value = 0;
> // value = REG_READ(MICROSECOND_TIME_BASE_DIV);
>
> // return value != 0x00120464;
> }
>
> // is_s0i3_golden_init_wa_done -> dccg2_is_s0i3_golden_init_wa_done
> bool dccg2_is_s0i3_golden_init_wa_done(struct dccg *dccg)
> {
> struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
>
> return REG_READ(MICROSECOND_TIME_BASE_DIV) == 0x00120464;
> }
> ```
>
> I instrumented this code to compare the values.
> On boot, the value is the same. When resuming from s3 sleep, different.
> If using the output of this codepath before this commit, the screen works.
> At the end of this email is my "debugging patch", and the logs comparing what shows
> up on boot vs on resuming from sleep.
>
> I am attempting to implement a `dccg21_is_s0i3_golden_init_wa_done` to
> replace the `dccg2_is_s0i3_golden_init_wa_done` that is used in dcn21_dccg.c.
> Maybe dcn21 needs a separate register page, (insted of using dcn20_dccg.h)?
>
>
> Note the difference between log line 2 and 5
> [ 4.956404] [ T316] AUYER PATCHED in dcn21_s0i3_golden_init_wa, values compared to 0x00120464
> [ 4.956407] [ T316] AUYER in dcn21_s0i3_golden_init_wa, original flow value: 1180208, bool: 1
> [ 4.956411] [ T316] AUYER in dcn21_s0i3_golden_init_wa: MICROSECOND_TIME_BASE_DIV reg: 13b value: 1180208
> [ 4.956412] [ T316] AUYER in dccg21_is_s0i3_golden_init_wa_done
> [ 4.956415] [ T316] AUYER in dccg21_is_s0i3_golden_init_wa_done: MICROSECOND_TIME_BASE_DIV reg: 0, value: 1148576
> [ 4.956418] [ T316] AUYER in dcn21_s0i3_golden_init_wa, NEW flow value as bool 1
>
>
> 1 [ 4.942660] [ T343] AUYER PATCHED in dcn21_s0i3_golden_init_wa
> 2 [ 4.942662] [ T343] AUYER in dcn21_s0i3_golden_init_wa, original flow value: 1180208, comparing to 0x00120464 bool: 1
> 3 [ 4.942665] [ T343] AUYER in dcn21_s0i3_golden_init_wa: MICROSECOND_TIME_BASE_DIV reg: 13b value: 1180208
> 4 [ 4.942668] [ T343] AUYER in dccg2_is_s0i3_golden_init_wa_done: MICROSECOND_TIME_BASE_DIV reg: 0, value: 1148576
> 5 [ 4.942671] [ T343] AUYER in dcn21_s0i3_golden_init_wa, NEW flow value as is: bool 1
>
> On wake from S3:
>
> 1 [ 279.431636] [ T5497] AUYER PATCHED in dcn21_s0i3_golden_init_wa
> 2 [ 279.431638] [ T5497] AUYER in dcn21_s0i3_golden_init_wa, original flow value: 1180772, comparing to 0x00120464 bool: 0
> 3 [ 279.431640] [ T5497] AUYER in dcn21_s0i3_golden_init_wa: MICROSECOND_TIME_BASE_DIV reg: 13b value: 1180772
> 4 [ 279.431641] [ T5497] AUYER in dccg2_is_s0i3_golden_init_wa_done: MICROSECOND_TIME_BASE_DIV reg: 0, value: 1148576
> 5 [ 279.431642] [ T5497] AUYER in dcn21_s0i3_golden_init_wa, NEW flow value as is: bool 1
>
>
> The "patch" (just a test lab), to understad where these logs came from.
> I applies cleanly to amddrm drm-next, and mainline.
>
> ---
> .../amd/display/dc/dccg/dcn20/dcn20_dccg.c | 3 +++
> .../amd/display/dc/hwss/dcn21/dcn21_hwseq.c | 25 ++++++++++++++++---
> 2 files changed, 24 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c b/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c
> index 13ba7f5ce13e..0ba20c7969ed 100644
> --- a/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c
> +++ b/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c
> @@ -158,6 +158,9 @@ bool dccg2_is_s0i3_golden_init_wa_done(struct dccg *dccg)
> {
> struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
>
> + printk(KERN_CRIT "AUYER in %s: MICROSECOND_TIME_BASE_DIV reg: %x, value: %d",
> + __func__, dccg_dcn->regs->MICROSECOND_TIME_BASE_DIV, REG_READ(MICROSECOND_TIME_BASE_DIV));
> +
> return REG_READ(MICROSECOND_TIME_BASE_DIV) == 0x00120464;
> }
>
> diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
> index 062745389d9a..143c552e0fa9 100644
> --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
> +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
> @@ -88,10 +88,28 @@ int dcn21_init_sys_ctx(struct dce_hwseq *hws, struct dc *dc, struct dc_phy_addr_
>
> bool dcn21_s0i3_golden_init_wa(struct dc *dc)
> {
> - if (dc->res_pool->dccg && dc->res_pool->dccg->funcs && dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done)
> - return !dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done(dc->res_pool->dccg);
>
> - return false;
> + printk(KERN_CRIT "AUYER PATCHED in %s, values compared to 0x00120464", __func__);
> +
> + // original flow
> + struct dce_hwseq *hws = dc->hwseq;
> + uint32_t value = 0;
> + value = REG_READ(MICROSECOND_TIME_BASE_DIV);
> +
> + printk(KERN_CRIT "AUYER in %s, original flow value: %d, bool: %d",
> + __func__, value, value != 0x00120464);
> +
> + printk(KERN_CRIT "AUYER in %s: MICROSECOND_TIME_BASE_DIV reg: %x value: %d",
> + __func__, hws->regs->MICROSECOND_TIME_BASE_DIV, REG_READ(MICROSECOND_TIME_BASE_DIV));
> +
> + if (dc->res_pool->dccg && dc->res_pool->dccg->funcs && dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done) {
> + // new flow
> + bool v2 = 0;
> + v2 = !dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done(dc->res_pool->dccg);
> + printk(KERN_CRIT "AUYER in %s, NEW flow value as bool %d", __func__, v2);
> + }
> +
> + return value != 0x00120464;
> }
>
> void dcn21_exit_optimized_pwr_state(
> @@ -298,4 +316,3 @@ bool dcn21_is_abm_supported(struct dc *dc,
> }
> return false;
> }
> -
> --
> 2.53.0
>
>