Re: [PATCH 0/3] drm: commit_work scheduling

From: Qais Yousef
Date: Mon Sep 21 2020 - 12:10:28 EST


On 09/19/20 12:37, Rob Clark wrote:
> From: Rob Clark <robdclark@xxxxxxxxxxxx>
>
> The android userspace treats the display pipeline as a realtime problem.
> And arguably, if your goal is to not miss frame deadlines (ie. vblank),
> it is. (See https://lwn.net/Articles/809545/ for the best explanation
> that I found.)
>
> But this presents a problem with using workqueues for non-blocking
> atomic commit_work(), because the SCHED_FIFO userspace thread(s) can
> preempt the worker. Which is not really the outcome you want.. once
> the required fences are scheduled, you want to push the atomic commit
> down to hw ASAP.
>
> But the decision of whether commit_work should be RT or not really
> depends on what userspace is doing. For a pure CFS userspace display
> pipeline, commit_work() should remain SCHED_NORMAL.

Just a side note: this RT vs CFS interoperability is an issue that
crops up every now and again.

https://lore.kernel.org/lkml/1567048502-6064-1-git-send-email-jing-ting.wu@xxxxxxxxxxxx/

Does the UI thread in Android ever run as RT, by the way? I have always
suspected it is susceptible to such delays, since it is part of the
application and, I assumed, can't be trusted to become RT.

Those 120Hz displays will stress the pipeline :-)
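
To put a rough number on that: the time available per frame is just the
inverse of the refresh rate, so a 120 Hz panel leaves about 8.33 ms per
frame against about 16.67 ms at 60 Hz. A minimal sketch of the arithmetic
(the function name is made up for illustration, and it ignores how the
budget is split between userspace, commit_work and the hardware):

```python
def frame_budget_ms(refresh_hz: float) -> float:
    """Time between two vblanks, in milliseconds, for a given refresh rate."""
    if refresh_hz <= 0:
        raise ValueError("refresh rate must be positive")
    return 1000.0 / refresh_hz


# Compare the per-frame budget at a few common refresh rates.
for hz in (60, 90, 120):
    print(f"{hz:>3} Hz -> {frame_budget_ms(hz):.2f} ms per frame")
```

Any delay in getting the commit worker onto a CPU comes straight out of
that budget.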

>
> To handle this, convert non-blocking commit_work() to use per-CRTC
> kthread workers, instead of system_unbound_wq. Per-CRTC workers are
> used to avoid serializing commits when userspace is using a per-CRTC
> update loop.
>
> A client-cap is introduced so that userspace can opt-in to SCHED_FIFO
> priority commit work.
>
> A potential issue is that since 616d91b68cd ("sched: Remove
> sched_setscheduler*() EXPORTs") we have limited RT priority levels,
> meaning that commit_work() ends up running at the same priority level
> as vblank-work. This shouldn't be a big problem *yet*, due to limited
> use of vblank-work at this point. And if it could be arranged that
> vblank-work is scheduled before signaling out-fences and/or sending
> pageflip events, it could probably work ok to use a single priority
> level for both commit-work and vblank-work.

How much of a problem this is also depends on the number of CPUs. As long
as nr_cpus > nr_running_rt_tasks you should be fine.
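
As a rough illustration of that condition, here is a small sketch that
counts runnable RT tasks and compares the count against the CPU count. The
Task type, the task names and the helper functions are all hypothetical,
and it deliberately ignores CPU affinity and RT throttling, so treat it as
a rule of thumb rather than a model of the scheduler:

```python
from dataclasses import dataclass

# Policies treated as realtime for this back-of-the-envelope check.
RT_POLICIES = {"SCHED_FIFO", "SCHED_RR"}


@dataclass
class Task:
    name: str        # illustrative label only
    policy: str      # e.g. "SCHED_FIFO" or "SCHED_NORMAL"
    runnable: bool   # True if the task currently wants a CPU


def nr_running_rt_tasks(tasks) -> int:
    """Count the tasks that are both runnable and use an RT policy."""
    return sum(1 for t in tasks if t.runnable and t.policy in RT_POLICIES)


def rt_has_headroom(nr_cpus: int, tasks) -> bool:
    """The condition from the mail: nr_cpus > nr_running_rt_tasks."""
    return nr_cpus > nr_running_rt_tasks(tasks)


# Hypothetical snapshot: two RT workers plus one CFS thread.
tasks = [
    Task("commit_work", "SCHED_FIFO", True),
    Task("vblank_work", "SCHED_FIFO", True),
    Task("ui_thread", "SCHED_NORMAL", True),
]

print(rt_has_headroom(4, tasks))
print(rt_has_headroom(2, tasks))
```

With two runnable RT tasks, four CPUs satisfy the condition and two do not.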

Cheers

--
Qais Yousef