Re: [PATCH 06/11] drm/panthor: Don't overrule pending immediate ticks in sched_resume_tick()

From: Boris Brezillon

Date: Fri Jun 26 2026 - 09:19:34 EST


On Fri, 26 Jun 2026 13:45:38 +0100
Liviu Dudau <liviu.dudau@xxxxxxx> wrote:

> On Thu, Jun 25, 2026 at 02:40:32PM +0200, Boris Brezillon wrote:
> > We schedule immediate ticks when we need to process events on CSGs,
> > but those immediate ticks don't change the resched_target because we
> > want the other groups to stay scheduled for the remaining of the GPU
> > timeslot they were given. Make sure these immediate ticks don't get
> > overruled by a sched_queue_delayed_work() that would delay the tick
> > execution.
> >
> > Fixes: 99820b4b7e50 ("drm/panthor: Make sure we resume the tick when new jobs are submitted")
> > Reported-by: sashiko-bot@xxxxxxxxxx
> > Closes: https://sashiko.dev/#/patchset/20260625-panthor-signal-from-irq-v4-0-3d2908912afa@xxxxxxxxxxxxx?part=9
> > Signed-off-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>
> > ---
> > drivers/gpu/drm/panthor/panthor_sched.c | 9 ++++++++-
> > 1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> > index 8fd4d97b062e..ab3e13e44a26 100644
> > --- a/drivers/gpu/drm/panthor/panthor_sched.c
> > +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> > @@ -2667,7 +2667,14 @@ static void sched_resume_tick(struct panthor_device *ptdev)
> > else
> > delay_jiffies = 0;
> >
> > - sched_queue_delayed_work(sched, tick, delay_jiffies);
> > + /* We schedule immediate ticks when we need to process events on CSGs,
> > + * but those don't change the resched_target because we want the other
> > + * groups to stay scheduled for the remaining of the GPU timeslot they
> > + * were given. Make sure those immediate ticks don't get overruled by
> > + * a sched_queue_delayed_work() that would delay the tick execution.
> > + */
> > + if (!delayed_work_pending(&sched->tick_work))
> > + sched_queue_delayed_work(sched, tick, delay_jiffies);
>
> Maybe I'm having a Friday heat brain freeze, but it feels like the comment and the code
> are going in a different direction. It doesn't help that the commit message copies the
> comment so I can't tell if I'm misreading the comment or there was a different intent.

There are basically two kinds of ticks:

1. The periodic ones that serve as a way to rotate groups on the slots
and give everyone a chance to get a GPU slice
2. The immediate ones which are there to process events coming from an
interrupt, or to re-evaluate groups to schedule because a new group
became active

To detect the kind of tick, we use ::resched_target. If the current time
is before this target, the tick was triggered by an event, and resident
groups shouldn't be evicted (I'm intentionally leaving out RT groups to
keep things simple).
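
As a rough sketch of that check (a toy userspace model, not the panthor
code: classify_tick() and may_evict_resident_groups() are names made up
for illustration, and the model ignores RT groups and jiffies
wraparound, which real kernel code would have to handle with the
time_before64() family of helpers):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two kinds of ticks described above. */
enum tick_kind {
	TICK_PERIODIC,	/* timeslice expired: groups may be rotated */
	TICK_IMMEDIATE,	/* event processing: resident groups stay put */
};

/*
 * Assumption: resched_target is the time at which the current timeslice
 * ends. A tick that runs before that point can only have been triggered
 * by an event.
 */
static enum tick_kind classify_tick(uint64_t now, uint64_t resched_target)
{
	return now < resched_target ? TICK_IMMEDIATE : TICK_PERIODIC;
}

/* Resident groups are only candidates for eviction on a periodic tick. */
static bool may_evict_resident_groups(uint64_t now, uint64_t resched_target)
{
	return classify_tick(now, resched_target) == TICK_PERIODIC;
}
```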

The problem we have with sched_resume_tick() is that it
unconditionally calls sched_queue_delayed_work()
(mod_delayed_work() internally). So, if we have an immediate tick
pending (one that didn't adjust ::resched_target), we end up
rescheduling the work to a later point, thus delaying the processing of
this asynchronous event for no good reason. What this commit does is
skip the sched_queue_delayed_work() call if the tick_work is pending.
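
To make the before/after behaviour concrete, here is a toy userspace
model (not the kernel implementation: struct toy_delayed_work and the
toy_*() helpers are made up for illustration, and the only property
modelled is that mod_delayed_work() re-arms a work item that is already
pending):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a delayed work item and its timer. */
struct toy_delayed_work {
	bool pending;
	uint64_t expires;
};

/* Models mod_delayed_work(): always (re)arms, even if already pending. */
static void toy_mod_delayed_work(struct toy_delayed_work *w,
				 uint64_t now, uint64_t delay)
{
	assert(w);
	w->pending = true;
	w->expires = now + delay;
}

/* Old behaviour: unconditionally re-arm, overruling an immediate tick. */
static void toy_resume_tick_old(struct toy_delayed_work *w,
				uint64_t now, uint64_t delay)
{
	toy_mod_delayed_work(w, now, delay);
}

/* New behaviour: leave an already pending tick alone. */
static void toy_resume_tick_new(struct toy_delayed_work *w,
				uint64_t now, uint64_t delay)
{
	if (!w->pending)
		toy_mod_delayed_work(w, now, delay);
}
```

With an immediate tick queued at time 100 (delay 0), a call to
toy_resume_tick_old() with a delay of 50 pushes the expiry to 150,
while toy_resume_tick_new() leaves it at 100.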

I'd be happy to change the wording if you have something to propose.