Re: [PATCH] drm/panfrost: Ignore core_mask for poweroff and sync interrupts

From: Boris Brezillon
Date: Fri Nov 24 2023 - 05:21:58 EST


On Fri, 24 Nov 2023 11:12:57 +0100
AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx>
wrote:

> On 24/11/23 10:17, AngeloGioacchino Del Regno wrote:
> > On 23/11/23 16:40, Boris Brezillon wrote:
> >> On Thu, 23 Nov 2023 16:14:12 +0100
> >> AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx>
> >> wrote:
> >>
> >>> On 23/11/23 14:51, Boris Brezillon wrote:
> >>>> On Thu, 23 Nov 2023 14:24:57 +0100
> >>>> AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx>
> >>>> wrote:
> >>>>>>>
> >>>>>>> So, while I agree that it'd be slightly more readable as a diff if those
> >>>>>>> were two different commits I do have reasons against splitting.....
> >>>>>>
> >>>>>> If we just need a quick fix to avoid PWRTRANS interrupts from kicking
> >>>>>> in when we power-off the cores, I think we'd be better off dropping
> >>>>>> GPU_IRQ_POWER_CHANGED[_ALL] from the value we write to GPU_INT_MASK
> >>>>>> at [re]initialization time, and then have a separate series that fixes
> >>>>>> the problem more generically.
> >>>>>
> >>>>> But that didn't work:
> >>>>> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@xxxxxxxxxx/
> >>>>
> >>>> I meant, your 'ignore-core_mask' fix + the
> >>>> 'drop GPU_IRQ_POWER_CHANGED[_ALL] in GPU_INT_MASK' one.
> >>>>
> >>>> So,
> >>>>
> >>>> https://lore.kernel.org/all/4c73f67e-174c-497e-85a5-cb053ce657cb@xxxxxxxxxxxxx/
> >>>> +
> >>>> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@xxxxxxxxxx/
> >>>>>
> >>>>>
> >>>>> ...while this "full" solution worked:
> >>>>> https://lore.kernel.org/all/39e9514b-087c-42eb-8d0e-f75dc620e954@xxxxxxxxxx/
> >>>>>
> >>>>> https://lore.kernel.org/all/5b24cc73-23aa-4837-abb9-b6d138b46426@xxxxxxxxxx/
> >>>>>
> >>>>>
> >>>>> ...so this *is* a "quick fix" already... :-)
> >>>>
> >>>> It's a half-baked solution for the missing irq-synchronization-on-suspend
> >>>> issue IMHO. I understand why you want it all in one patch that can serve
> >>>> as a fix for 123b431f8a5c ("drm/panfrost: Really power off GPU cores in
> >>>> panfrost_gpu_power_off()"), which is why I'm suggesting to go for an
> >>>> even simpler diff (see below), and then fully address the
> >>>> irq-synchronization-on-suspend issue in a follow-up patchset.
> >>>> --->8---
> >>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c
> >>>> b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> >>>> index 09f5e1563ebd..6e2d7650cc2b 100644
> >>>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
> >>>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> >>>> @@ -78,7 +78,10 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev)
> >>>>           }
> >>>>           gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
> >>>> -       gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL);
> >>
> >> We probably want a comment here:
> >>
> >>     /* Only enable the interrupts we care about. */
> >>
> >>>> +       gpu_write(pfdev, GPU_INT_MASK,
> >>>> +                 GPU_IRQ_MASK_ERROR |
> >>>> +                 GPU_IRQ_PERFCNT_SAMPLE_COMPLETED |
> >>>> +                 GPU_IRQ_CLEAN_CACHES_COMPLETED);
> >>>
> >>> ...but if we do that, the next patch(es) will contain a partial revert of this
> >>> commit, putting back this to gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL)...
> >>
> >> Why should we revert it? We're not processing the PWRTRANS interrupts
> >> in the interrupt handler, those should never have been enabled in the
> >> first place. The only reason we'd want to revert that change is if we
> >> decide to have interrupt-based waits in the poweron/off
> >> implementation, which, as far as I'm aware, is not something we intend
> >> to do any time soon.
> >>
> >
> > You're right, yes. Okay, I'll push the new code soon.
> >
> > Cheers!
> >
>
> Update: I was running some (rather fast) tests here because I ... felt like playing
> with it, basically :-)
>
> So, I had an issue with MediaTek platforms being unable to cut power to the GPU or
> disable clocks aggressively... and after trying "this and that" I couldn't get it
> working (in runtime suspend).
>
> Long story short - after implementing `panfrost_{job,mmu,gpu}_suspend_irq()` (only
> gpu irq, as you said, is a half solution), I can not only turn off clocks, but even
> turn off GPU power supplies entirely, bringing the power consumption of the GPU
> itself during *runtime* suspend to ... zero.

Very nice!

>
> The result of this test makes me truly happy, even though complete powercut during
> runtime suspend may not be feasible for other reasons (takes ~200000ns on AVG,
> MIN ~160000ns, but the MAX is ~475000ns - and beware that I haven't run that for
> long, I'd expect it to reach 1-1.5ms as max time, so that's a big no).

Do you know what's taking so long? I'm disabling clks + the main power
domain in panthor (I leave the regulators enabled), but I didn't get to
measure the time it takes to enter/exit suspend. I might have to do
what you did in panfrost and have different paths for system and RPM
suspend.

>
> This means that I will take a day or two and I'll push both the "simple" fix for
> the Really-power-off and also some more commits to add the full irq sync.

Thanks for working on that, and sorry if I've been picky in my previous
reviews. Looking forward to reviewing these new patches.
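
For reference, the GPU_INT_MASK change discussed above can be sketched as
a standalone snippet. This is only an illustration under stated
assumptions: the bit positions below are picked for the example and are
not copied from panfrost_regs.h, and gpu_int_mask_wanted() is a
hypothetical helper, not a function in the driver.

```c
#include <stdint.h>

/* Illustrative bit layout: assumptions for this sketch only. */
#define BIT(n)                            (UINT32_C(1) << (n))
#define GPU_IRQ_FAULT                     BIT(0)
#define GPU_IRQ_MULTIPLE_FAULT            BIT(7)
#define GPU_IRQ_RESET_COMPLETED           BIT(8)
#define GPU_IRQ_POWER_CHANGED             BIT(9)
#define GPU_IRQ_POWER_CHANGED_ALL         BIT(10)
#define GPU_IRQ_PERFCNT_SAMPLE_COMPLETED  BIT(16)
#define GPU_IRQ_CLEAN_CACHES_COMPLETED    BIT(17)

#define GPU_IRQ_MASK_ERROR \
	(GPU_IRQ_FAULT | GPU_IRQ_MULTIPLE_FAULT)

/* Every interrupt source, including the power-transition ones. */
#define GPU_IRQ_MASK_ALL \
	(GPU_IRQ_MASK_ERROR | GPU_IRQ_RESET_COMPLETED | \
	 GPU_IRQ_POWER_CHANGED | GPU_IRQ_POWER_CHANGED_ALL | \
	 GPU_IRQ_PERFCNT_SAMPLE_COMPLETED | GPU_IRQ_CLEAN_CACHES_COMPLETED)

/*
 * Hypothetical helper: only enable the interrupts the handler actually
 * processes. The power-transition bits are left out, so powering the
 * cores off does not raise an interrupt nobody is waiting for.
 */
static inline uint32_t gpu_int_mask_wanted(void)
{
	return GPU_IRQ_MASK_ERROR |
	       GPU_IRQ_PERFCNT_SAMPLE_COMPLETED |
	       GPU_IRQ_CLEAN_CACHES_COMPLETED;
}

/* Returns non-zero if a mask would let power-transition IRQs through. */
static inline int gpu_int_mask_has_power_irqs(uint32_t mask)
{
	return (mask & (GPU_IRQ_POWER_CHANGED | GPU_IRQ_POWER_CHANGED_ALL)) != 0;
}
```

With this layout, gpu_int_mask_has_power_irqs(GPU_IRQ_MASK_ALL) is true
while gpu_int_mask_has_power_irqs(gpu_int_mask_wanted()) is false, which
is the whole point of the simpler diff.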

Regards,

Boris