Re: [PATCH v2 0/3] drm: commit_work scheduling
From: Rob Clark
Date: Mon Oct 05 2020 - 19:24:53 EST
On Mon, Oct 5, 2020 at 8:00 AM Qais Yousef <qais.yousef@xxxxxxx> wrote:
>
> +CC Steve and Peter - they might be interested.
>
> On 10/02/20 11:07, Rob Clark wrote:
> > On Fri, Oct 2, 2020 at 4:01 AM Qais Yousef <qais.yousef@xxxxxxx> wrote:
> > >
> > > On 09/30/20 14:17, Rob Clark wrote:
> > > > From: Rob Clark <robdclark@xxxxxxxxxxxx>
> > > >
> > > > The android userspace treats the display pipeline as a realtime problem.
> > > > And arguably, if your goal is to not miss frame deadlines (ie. vblank),
> > > > it is. (See https://lwn.net/Articles/809545/ for the best explaination
> > > > that I found.)
> > > >
> > > > But this presents a problem with using workqueues for non-blocking
> > > > atomic commit_work(), because the SCHED_FIFO userspace thread(s) can
> > > > preempt the worker. Which is not really the outcome you want.. once
> > > > the required fences are scheduled, you want to push the atomic commit
> > > > down to hw ASAP.
> > >
> > > For me thees 2 properties
> > >
> > > 1. Run ASAP
> > > 2. Finish the work un-interrupted
> > >
> > > Scream the workers need to be SCHED_FIFO by default. CFS can't give you these
> > > guarantees.
> >
> > fwiw, commit_work does sleep/block for some time until fences are
> > signalled, but then once that happens we want it to run ASAP,
> > preempting lower priority SCHED_FIFO.
> >
> > >
> > > IMO using sched_set_fifo() for these workers is the right thing.
> > >
> >
> > Possibly, but we still have limited prioritization options (ie. not
> > enough) to set these from the kernel. Giving userspace the control,
> > so it can pick sensible priorities for commit_work and vblank_work,
> > which fits in with the priorities of the other userspace threads seems
> > like the sensible thing.
>
> The problem is that the kernel can run on all types of systems. It's impossible
> to pick one value that fits all. Userspace must manage these priorities, and
> you can still export the TID to help with that.
>
> But why do you need several priorities in your pipeline? I would have thought
> it should execute each stage sequentially and all tasks running at the same RT
> priority is fine.
On the kernel side, vblank work should complete during the vblank
period, making it a harder real time requirement. So the thinking is
this should be a higher priority.
But you are right, if you aren't overcommitted it probably doesn't matter.
> On SMP priorities matter once you've overcomitted the systems. You need to have
> more RT tasks running than CPUs for priorities to matter. It seems you have
> a high count of RT tasks in your system?
>
> I did some profiles on Android and found that being overcomitted is hard. But
> that was a while ago.
>
> >
> > > >
> > > > But the decision of whether commit_work should be RT or not really
> > > > depends on what userspace is doing. For a pure CFS userspace display
> > > > pipeline, commit_work() should remain SCHED_NORMAL.
> > >
> > > I'm not sure I agree with this. I think it's better to characterize tasks based
> > > on their properties/requirements rather than what the rest of the userspace is
> > > using.
> >
> > I mean, the issue is that userspace is already using a few different
> > rt priority levels for different SF threads. We want commit_work to
>
> Why are they at different priorities? Different priority levels means that some
> of them have more urgent deadlines to meet and it's okay to steal execution
> time from lower priority tasks. Is this the case?
tbh, I'm not fully aware of the background. It looks like most of the
SF threads run at priority=2 (100-2==98), and the main one runs at
priority=1
> RT planning and partitioning is not easy task for sure. You might want to
> consider using affinities too to get stronger guarantees for some tasks and
> prevent cross-talking.
There is some cgroup stuff that is pinning SF and some other stuff to
the small cores, fwiw.. I think the reasoning is that they shouldn't
be doing anything heavy enough to need the big cores.
> > run ASAP once fences are signalled, and vblank_work to run at a
> > slightly higher priority still. But the correct choice for priorities
> > here depends on what userspace is using, it all needs to fit together
> > properly.
>
> By userspace here I think you mean none display pipeline related RT tasks that
> you need to coexit with and could still disrupt your pipeline?
I mean, commit_work should be higher priority than the other (display
related) RT tasks. But the kernel doesn't know what those priorities
are.
> Using RT on Gerneral Purpose System is hard for sure. One of the major
> challenge is that there's no admin that has full view of the system to do
> proper RT planning.
>
> We need proper RT balancer daemon that helps partitioning the system for
> multiple RT apps on these systems..
>
> >
> > >
> > > I do appreciate that maybe some of these tasks have varying requirements during
> > > their life time. e.g: they have RT property during specific critical section
> > > but otherwise are CFS tasks. I think the UI thread in Android behaves like
> > > that.
> > >
> > > It's worth IMO trying that approach I pointed out earlier to see if making RT
> > > try to pick an idle CPU rather than preempt CFS helps. Not sure if it'd be
> > > accepted but IMHO it's a better direction to consider and discuss.
> >
> > The problem I was seeing was actually the opposite.. commit_work
> > becomes runnable (fences signalled) but doesn't get a chance to run
> > because a SCHED_FIFO SF thread is running. (Maybe I misunderstood and
> > you're approach would help this case too?)
>
> Ah okay. Sorry I got it the wrong way around for some reason. I thought this
> task is preempting other CFS-based pipelined tasks.
>
> So your system seems to be overcomitted. Is SF short for SufraceFlinger? Under
> what scenarios do you have many SurfaceFlinger tasks? On Android I remember
> seeing they have priority of 1 or 2.
yeah, SF==SurfaceFlinger, and yeah, 1 and 2..
> sched_set_fifo() will use priority 50. If you set all your pipeline tasks
> to this priority, what happens?
I think this would work.. drm/msm doesn't use vblank work, so I
wouldn't really have problems with commit_work preempting vblank_work.
But I think the best option (and to handle the case if android changes
the RT priorties around in the future) is to let userspace set the
priorities.
> >
> > > Or maybe you can wrap userspace pipeline critical section lock such that any
> > > task holding it will automatically be promoted to SCHED_FIFO and then demoted
> > > to CFS once it releases it.
> >
> > The SCHED_DEADLINE + token passing approach that the lwn article
> > mentioned sounds interesting, if that eventually becomes possible.
> > But doesn't really help today..
>
> We were present in the room with Alessio when he gave that talk :-)
>
> You might have seen Valentin's talk in LPC where he's trying to get
> proxy-execution into shape. Which is a pre-requisite to enable using of
> SCHED_DEADLINE for these scenarios. IIRC it should allow all dependent tasks to
> run from the context of the deadline task during the display pipeline critical
> section.
>
> By the way, do you have issues with SoftIrqs delaying your RT tasks execution
> time?
I don't *think* so, but I'm not 100% sure if they are showing up in
traces. So far it seems like SF stomping on commit_work. (There is
the added complication that there are some chrome gpu-process tasks in
between SF and the display, including CrGpuMain (which really doesn't
want to be SCHED_FIFO when executing gl commands on behalf of
something unrelated to the compositor.. the deadline approach, IIUC,
might be the better option eventually for this?)
BR,
-R
>
> Thanks
>
> --
> Qais Yousef