Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

From: Boris Brezillon
Date: Fri Oct 27 2023 - 12:26:37 EST


On Fri, 27 Oct 2023 16:34:26 +0200
Danilo Krummrich <dakr@xxxxxxxxxx> wrote:

> On 10/27/23 09:17, Boris Brezillon wrote:
> > Hi Danilo,
> >
> > On Thu, 26 Oct 2023 18:13:00 +0200
> > Danilo Krummrich <dakr@xxxxxxxxxx> wrote:
> >
> >> +
> >> + /**
> >> + * @update_job_credits: Called once the scheduler is considering this
> >> + * job for execution.
> >> + *
> >> + * Drivers may use this to update the job's submission credits, which is
> >> + * useful to e.g. deduct the number of native fences which have been
> >> + * signaled meanwhile.
> >> + *
> >> + * The callback must either return the new number of submission credits
> >> + * for the given job, or zero if no update is required.
> >> + *
> >> + * This callback is optional.
> >> + */
> >> + u32 (*update_job_credits)(struct drm_sched_job *sched_job);
> >
> > I'm copying my late reply to v2 here so it doesn't get lost:
> >
> > I keep thinking it'd be simpler to make this a void function that
> > updates s_job->submission_credits directly. I also don't see the
> > problem with doing a sanity check on job->submission_credits. I mean,
> > if the driver is doing something silly, you can't do much to prevent it
> > anyway, except warn the user that something wrong has happened. If you
> > want to
> >
> > WARN_ON(job->submission_credits == 0 ||
> > job->submission_credits > job_old_submission_credits);
> >
> > that's fine. But none of this sanity checking has to do with the
> > function prototype/semantics, and I'm still not comfortable with this 0
> > => no-change. If there's no change, we should just leave
> > job->submission_credits unchanged (or return job->submission_credits)
> > instead of inventing a new special case.
>
> If we can avoid letting drivers change fields of generic structures directly
> without any drawbacks I think we should avoid it. Currently, drivers shouldn't
> have the need to mess with job->credits directly. The initial value is set
> through drm_sched_job_init() and is updated through the return value of
> update_job_credits().

Fair enough. I do agree that keeping internal fields out of driver
hands is a good thing in general, it's just that it's already
free-for-all in so many places in drm_sched (like the fact drivers
iterate the pending list in their stop-queue handling) that I didn't
really see it as an issue. Note that there's always the option of
providing drm_sched_job_{update,get}_credits() helpers, with the update
helper making sure the new credits value is consistent (smaller or
equal to the old one, and not zero).
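
To make the helper idea concrete, here is a minimal userspace sketch. It
is not drm_sched code: struct mock_sched_job and both mock_* helpers are
hypothetical stand-ins, and fprintf() stands in for a kernel WARN(). The
only rule it encodes is the one stated above, i.e. the new value must be
non-zero and smaller than or equal to the old one.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the scheduler job; only the field discussed here. */
struct mock_sched_job {
	uint32_t credits;
};

/* Hypothetical get helper, so drivers never read the field directly. */
static uint32_t mock_sched_job_get_credits(const struct mock_sched_job *job)
{
	return job->credits;
}

/*
 * Hypothetical update helper. Rejects zero and any value larger than the
 * current one, leaving the job untouched in that case. Returns true when
 * the new value was accepted.
 */
static bool mock_sched_job_update_credits(struct mock_sched_job *job,
					  uint32_t new_credits)
{
	if (new_credits == 0 || new_credits > job->credits) {
		/* A real implementation would presumably WARN() here. */
		fprintf(stderr, "invalid credits update: %u -> %u\n",
			(unsigned int)job->credits, (unsigned int)new_credits);
		return false;
	}

	job->credits = new_credits;
	return true;
}
```

With something along these lines, the consistency check lives in one
place and drivers never write the field themselves.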

>
> I'm fine getting rid of the 0 => no-change semantics though. Instead we can just
> WARN() on 0.

Yeah, I think that's preferable. It's pretty easy to return the old
value if the driver has a way to detect when nothing changed (with a
get helper if you don't want drivers to touch the credits field).
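
For illustration, a rough userspace sketch of what "return the old value
when nothing changed" could look like. Everything here is hypothetical
(struct mock_job, the signaled_fences counter, the mock_* functions); it
assumes each signaled native fence frees one credit and that a job always
keeps at least one credit. fprintf() stands in for WARN().

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical job: current credits plus some driver-side state. */
struct mock_job {
	uint32_t credits;
	uint32_t signaled_fences; /* native fences signaled since last update */
};

typedef uint32_t (*mock_update_credits_fn)(struct mock_job *job);

/* Driver side: always returns a valid credit count, never 0. */
static uint32_t mock_driver_update_credits(struct mock_job *job)
{
	uint32_t deducted = job->signaled_fences;

	/* Nothing to deduct, or deducting would leave no credits: no change. */
	if (deducted == 0 || deducted >= job->credits)
		return job->credits;

	job->signaled_fences = 0;
	return job->credits - deducted;
}

/* Scheduler side: 0 (or a larger value) is treated as a driver bug. */
static void mock_sched_refresh_credits(struct mock_job *job,
				       mock_update_credits_fn update)
{
	uint32_t new_credits;

	if (!update) /* the callback is optional */
		return;

	new_credits = update(job);
	if (new_credits == 0 || new_credits > job->credits) {
		fprintf(stderr, "driver returned invalid credits: %u\n",
			(unsigned int)new_credits);
		return;
	}

	job->credits = new_credits;
}
```

The point is only that the "no change" case needs no special return
value: the driver hands back the current count and the scheduler treats
0 as invalid.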

> However, if we do that I'd also want to change it for
> drm_sched_job_init() (where 0 currently defaults to 1) such that we accept 0, but
> WARN() accordingly.

Sure. You update all drivers anyway, so passing 1 instead of 0 is not a
big deal, I would say.

>
> I think it's consequent to either consistently give 0 a different meaning or just
> accept it but WARN() on it.

Using zero as a default value makes sense when you're passing
zero-initialized objects that are later extended with new fields, but
here you update the function prototype and all the call sites, so we're
better off considering 0 as an invalid value, IMHO.