Re: [Freedreno] [PATCH v9 15/15] drm/i915: Add deadline based boost support

From: Rob Clark
Date: Fri Mar 03 2023 - 10:41:46 EST


On Fri, Mar 3, 2023 at 7:08 AM Tvrtko Ursulin
<tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
>
>
> On 03/03/2023 14:48, Rob Clark wrote:
> > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> > <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
> >>
> >>
> >> On 03/03/2023 03:21, Rodrigo Vivi wrote:
> >>> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
> >>>> From: Rob Clark <robdclark@xxxxxxxxxxxx>
> >>>>
> >>>
> >>> missing some wording here...
> >>>
> >>>> v2: rebase
> >>>>
> >>>> Signed-off-by: Rob Clark <robdclark@xxxxxxxxxxxx>
> >>>> ---
> >>>> drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++
> >>>> 1 file changed, 20 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> >>>> index 7503dcb9043b..44491e7e214c 100644
> >>>> --- a/drivers/gpu/drm/i915/i915_request.c
> >>>> +++ b/drivers/gpu/drm/i915/i915_request.c
> >>>> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence)
> >>>> return i915_request_enable_breadcrumb(to_request(fence));
> >>>> }
> >>>>
> >>>> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> >>>> +{
> >>>> + struct i915_request *rq = to_request(fence);
> >>>> +
> >>>> + if (i915_request_completed(rq))
> >>>> + return;
> >>>> +
> >>>> + if (i915_request_started(rq))
> >>>> + return;
> >>>
> >>> why do we skip the boost if already started?
> >>> don't we want to boost the freq anyway?
> >>
> >> I'd wager Rob is just copying the current i915 wait boost logic.
> >
> > Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> > compared to your RFC, this could be the bug
>
> Hm, there I have preserved this same !i915_request_started logic.
>
> Presumably it's not just fewer boosts but lower performance. How is he
> setting the deadline? Somehow from clFlush or so?

Yeah, fewer boosts, lower freq/perf.. I cobbled together a quick mesa
hack to set the DEADLINE flag on syncobj waits, but it seems likely
that I missed something somewhere

BR,
-R

> Regards,
>
> Tvrtko
>
> P.S. Take note that I did not post the latest version of my RFC. The one
> where I fix the fence chain and array misses you pointed out. I did not
> think it would be worthwhile given no universal love for it, but if
> people are testing with it more widely that I was aware perhaps I should.
>
> >>>> +
> >>>> + /*
> >>>> + * TODO something more clever for deadlines that are in the
> >>>> + * future. I think probably track the nearest deadline in
> >>>> + * rq->timeline and set timer to trigger boost accordingly?
> >>>> + */
> >>>
> >>> I'm afraid it will be very hard to find some heuristics of what's
> >>> late enough for the boost no?
> >>> I mean, how early to boost the freq on an upcoming deadline for the
> >>> timer?
> >>
> >> We can off load this patch from Rob and deal with it separately, or
> >> after the fact?
> >
> > That is completely my intention, I expect you to replace my i915 patch ;-)
> >
> > Rough idea when everyone is happy with the core bits is to setup an
> > immutable branch without the driver specific patches, which could be
> > merged into drm-next and $driver-next and then each driver team can
> > add there own driver patches on top
> >
> > BR,
> > -R
> >
> >> It's a half solution without a smarter scheduler too. Like
> >> https://lore.kernel.org/all/20210208105236.28498-10-chris@xxxxxxxxxxxxxxxxxx/,
> >> or if GuC plans to do something like that at any point.
> >>
> >> Or bump the priority too if deadline is looming?
> >>
> >> IMO it is not very effective to fiddle with the heuristic on an ad-hoc
> >> basis. For instance I have a new heuristics which improves the
> >> problematic OpenCL cases for further 5% (relative to the current
> >> waitboost improvement from adding missing syncobj waitboost). But I
> >> can't really test properly for regressions over platforms, stacks,
> >> workloads.. :(
> >>
> >> Regards,
> >>
> >> Tvrtko
> >>
> >>>
> >>>> +
> >>>> + intel_rps_boost(rq);
> >>>> +}
> >>>> +
> >>>> static signed long i915_fence_wait(struct dma_fence *fence,
> >>>> bool interruptible,
> >>>> signed long timeout)
> >>>> @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = {
> >>>> .signaled = i915_fence_signaled,
> >>>> .wait = i915_fence_wait,
> >>>> .release = i915_fence_release,
> >>>> + .set_deadline = i915_fence_set_deadline,
> >>>> };
> >>>>
> >>>> static void irq_execute_cb(struct irq_work *wrk)
> >>>> --
> >>>> 2.39.1
> >>>>