Re: sched/fair: scheduler not running high priority process on idle cpu

From: Steven Rostedt
Date: Tue Jan 14 2020 - 12:48:17 EST


On Tue, 14 Jan 2020 17:33:50 +0000
David Laight <David.Laight@xxxxxxxxxx> wrote:

> I have added a cond_resched() to the offending loop, but a close look implies
> that code is called with a lock held in another (less common) path so that
> can't be directly committed and so CONFIG_PREEMPT won't help.
>
> Indeed requiring CONFIG_PREEMPT doesn't help when customers are running
> the application, nor (probably) on AWS since I doubt it is ever the default.
>
> Does the same apply to non-RT tasks?
> I can select almost any priority, but RT ones are otherwise a lot better.
>
> I've also seen RT processes delayed by the network stack 'bh' that runs
> in a softint from the hardware interrupt.
> That can take a while (clearing up tx and refilling rx) and I don't think we
> have any control over the cpu it runs on?

Yes, even with CONFIG_PREEMPT, Linux gives no latency guarantees for
any task, regardless of priority. If you have latency requirements, then
you need to apply the PREEMPT_RT patch (which may make it to mainline
this year!), with which spin locks and bh won't stop a task from
scheduling (unless they need the same lock).
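
Since the guarantees depend on which kernel is actually running, it can
help to check that on the target machine. The following is only a rough
heuristic sketch: the function name is made up for illustration, and it
assumes that an RT kernel reports "PREEMPT_RT" (or "PREEMPT RT" on older
RT patches) in its version string, or exposes /sys/kernel/realtime
containing "1". Neither check is guaranteed to hold on every kernel.

```python
import os


def looks_like_preempt_rt(version=None, realtime_path="/sys/kernel/realtime"):
    """Heuristically guess whether the running kernel is PREEMPT_RT.

    Both checks are assumptions about how RT kernels identify
    themselves; a False result is not proof of a non-RT kernel.
    """
    if version is None:
        # Same string as printed by `uname -v`.
        version = os.uname().version
    if "PREEMPT_RT" in version or "PREEMPT RT" in version:
        return True
    try:
        # Assumed to exist only on RT-patched kernels.
        with open(realtime_path) as f:
            return f.read().strip() == "1"
    except OSError:
        return False
```

Calling looks_like_preempt_rt() with no arguments inspects the running
system; the parameters exist only so the logic can be exercised without
an RT kernel at hand.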

>
> The cost of ftrace function call entry/exit (about 200 clocks) makes it
> rather unsuitable for any performance measurements unless only
> a very few functions are traced - which rather requires you know
> what the code is doing :-(
>

Well, when I use function tracing, I start by tracing all functions and
analyze the trace. Then I add the functions I don't care about (usually
spin locks and other utils) to the set_ftrace_notrace file, which keeps
them from being part of the trace. I keep doing this until I find a set
of functions that adds less overhead and still gives me enough
information to know what is happening. It also helps to enable all or
most events (at least scheduling events).
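
One iteration of that workflow could be scripted roughly as below. This
is a minimal sketch, not a definitive tool: it assumes tracefs is
mounted at /sys/kernel/tracing (older setups use
/sys/kernel/debug/tracing), that it runs as root, and the helper names
and the example patterns are hypothetical.

```python
import os

TRACEFS = "/sys/kernel/tracing"  # assumption: tracefs mount point


def _write(root, name, text):
    # Opening with "w" truncates, which resets list files such as
    # set_ftrace_notrace before the new contents are written.
    with open(os.path.join(root, name), "w") as f:
        f.write(text)


def start_trace(notrace=(), events=("sched",), root=TRACEFS):
    """Start the function tracer, excluding the given functions/globs."""
    _write(root, "tracing_on", "0")                       # pause while reconfiguring
    _write(root, "set_ftrace_notrace", " ".join(notrace)) # functions to leave out
    for ev in events:
        # Enable a whole event subsystem, e.g. events/sched/enable.
        _write(root, os.path.join("events", ev, "enable"), "1")
    _write(root, "current_tracer", "function")
    _write(root, "tracing_on", "1")
```

A first pass would be start_trace() with nothing excluded; after
reading the "trace" file, a later pass might be
start_trace(notrace=["_raw_spin_*", "rcu_*"]), growing the list until
the overhead is tolerable.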

-- Steve