Re: Real-time scheduling policies and hyper-threading

From: Roman Gushchin
Date: Fri Apr 25 2014 - 07:05:03 EST


24.04.2014, 22:59, "Peter Zijlstra" <peterz@xxxxxxxxxxxxx>:
> On Thu, Apr 24, 2014 at 10:16:12PM +0400, Roman Gushchin wrote:
>
>>  Are there any known solutions of this problem except disabling
>>  hyper-threading and frequency scaling at all?
>
> This is what we generally tell people to do; disable HT in the BIOS,
> offline the siblings or similar approaches.

It's a good approach, but an expensive one if you have many machines.
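
For reference, the "offline the siblings" variant can be scripted roughly as
follows. This is only a sketch: it assumes the usual sysfs layout
(/sys/devices/system/cpu/cpuN/topology/thread_siblings_list and
cpuN/online), and the helper names are made up for illustration. It keeps
the lowest-numbered thread of each core and reports the rest as candidates
for offlining.

```python
from pathlib import Path

# Assumed sysfs location of per-cpu topology and hotplug files.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def parse_cpu_list(text):
    """Parse a kernel cpu list such as "0,4" or "0-3,8" into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def siblings_to_offline(sibling_lists):
    """Given {cpu: thread_siblings_list text}, keep the lowest-numbered
    thread of every core and return all the other cpus."""
    extra = set()
    for text in sibling_lists.values():
        extra.update(parse_cpu_list(text)[1:])
    return sorted(extra)


def read_sibling_lists():
    """Read thread_siblings_list for every online cpu exposing topology."""
    lists = {}
    for path in SYSFS_CPU.glob("cpu[0-9]*/topology/thread_siblings_list"):
        cpu = int(path.parent.parent.name[3:])  # "cpu12" -> 12
        lists[cpu] = path.read_text()
    return lists


def offline(cpus):
    """Write 0 to cpuN/online for each cpu (requires root)."""
    for cpu in cpus:
        (SYSFS_CPU / f"cpu{cpu}" / "online").write_text("0")
```

Running offline(siblings_to_offline(read_sibling_lists())) as root would
take the extra threads down; doing that across a large fleet is exactly the
capacity cost I mean.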


> Similarly we typically tell people to keep each rt workload to a single
> NUMA node, and in case the workload really is bigger and must cross into
> multiple, use memory interleave where possible.
>
>>  Are there any common plans to enhance the load balancing algorithm in
>>  the rt-scheduler?
>
> Nope; at best the topology driven placement is going to be a heuristic,
> and the last thing you want for your realtime tasks is (non deterministic)
> placement heuristics.
>
> You, and only you, know your workload and can devise a correct placement
> policy. We have cpusets and cpu affinity available to carve up your
> system for this.

The problem here is that _all_ userspace tasks would have to care about cpu
placement. In reality, there are always a number of
monitoring/administration/system tasks whose execution increases latencies.
Using nice/SCHED_BATCH/... doesn't solve the problem, because with
hyper-threading even a low-priority task slows down whatever runs on the
sibling thread of the same core.

Of course, it's possible to divide the cpus statically via cpusets/partitioning,
but that is also not cheap in terms of overall cpu utilization.
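
To make the static split concrete, here is a minimal sketch of what such a
partitioning could look like from userspace. The function names and the
example topology are hypothetical; the only real interface used is
os.sched_setaffinity(), which wraps the sched_setaffinity(2) syscall. Whole
cores are handed to the rt workload so that nothing else ever runs on a
sibling thread.

```python
import os


def split_by_core(cores, rt_cores):
    """cores: list of per-core cpu lists, e.g. [[0, 4], [1, 5]].

    Returns (rt_cpus, other_cpus), with the first rt_cores whole cores
    (all of their hardware threads) reserved for the rt workload.
    """
    if not 0 <= rt_cores <= len(cores):
        raise ValueError("rt_cores must be between 0 and len(cores)")
    rt = {cpu for core in cores[:rt_cores] for cpu in core}
    other = {cpu for core in cores[rt_cores:] for cpu in core}
    return rt, other


def pin(pid, cpus):
    """Restrict pid (0 means the calling process) to the given cpus."""
    os.sched_setaffinity(pid, cpus)
```

For example, split_by_core([[0, 4], [1, 5], [2, 6], [3, 7]], 2) reserves
cpus 0, 1, 4 and 5 for rt tasks and leaves 2, 3, 6 and 7 for everything
else. Whenever the rt side is not busy, its cores simply sit idle, which is
the utilization loss I am referring to.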


> That said, there might be something we could do for soft(er) realtime,
> but I don't think there's been much (if any) research on this.
>
>>  Does anyone use rt-scheduler for runtime-like cpu-bound tasks?
>
> So in general cpu bound tasks in the RT classes (FIFO/RR/DEADLINE) are
> bad and can make the system go funny.
>
> For general system health it is important that various system tasks
> (kthreads usually) can run. Many of these kthreads run at !rt prios, and
> by having cpu bound tasks in rt prios they don't get to run.

I had also expected a number of problems here, but so far we have only hit
one tricky race in the timer code ( https://lkml.org/lkml/2014/3/17/323 ).
We haven't noticed any other such problems on several hundred machines
running in production for a few weeks.

It is probably important that the CPU load in our case never stays at 100%
for longer than a few hundred milliseconds.
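
One way to check a claim like this is to sample the per-cpu lines of
/proc/stat at a fixed interval and look for the longest fully busy stretch.
The sketch below is illustrative only: the helper names, the 0.99 threshold
and the sampling interval are assumptions, not what we actually run. It
treats idle + iowait as idle time, following the field order documented in
proc(5).

```python
def busy_fraction(prev_line, cur_line):
    """Busy fraction between two samples of one "cpuN ..." /proc/stat line.

    Fields after the label: user nice system idle iowait irq softirq steal.
    Guest time is already included in user, so only the first 8 are summed.
    """
    prev = [int(x) for x in prev_line.split()[1:9]]
    cur = [int(x) for x in cur_line.split()[1:9]]
    total = sum(cur) - sum(prev)
    idle = sum(cur[3:5]) - sum(prev[3:5])  # idle + iowait
    if total <= 0:
        return 0.0
    return 1.0 - idle / total


def longest_saturated_ms(fractions, interval_ms, threshold=0.99):
    """Longest run of consecutive samples at or above threshold, in ms."""
    longest = run = 0
    for value in fractions:
        run = run + 1 if value >= threshold else 0
        longest = max(longest, run)
    return longest * interval_ms
```

With a 100 ms interval, the samples [0.5, 1.0, 1.0, 0.2, 1.0] give a
longest saturated stretch of 200 ms.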


Many thanks for your comments!

Regards,
Roman