Re: Renice X for cpu schedulers

From: hui
Date: Fri Apr 20 2007 - 01:26:27 EST


On Thu, Apr 19, 2007 at 06:32:15PM -0700, Michael K. Edwards wrote:
> But I think SCHED_FIFO on a chain of tasks is fundamentally not the
> right way to handle low audio latency. The object with a low latency
> requirement isn't the task, it's the device. When it's starting to
> get urgent to deliver more data to the device, the task that it's
> waiting on should slide up the urgency scale; and if it's waiting on
> something else, that something else should slide up the scale; and so
> forth. Similarly, responding to user input is urgent; so when user
> input is available (by whatever mechanism), the task that's waiting
> for it should slide up the urgency scale, etc.

DSP operations, particularly with digital synthesis, tend to max out
the CPU doing vector operations on as many processors as they can get
hold of. In a live-performance-critical application, it's important
to be able to deliver a protected amount of CPU to the thread doing
that work, as well as responsiveness to external input such as
controllers, etc...
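
To make that concrete: the usual userspace answer today is just to put
the synthesis thread into a real-time class. A minimal sketch using the
standard POSIX calls (the priority value is only an example):

#include <pthread.h>
#include <sched.h>

/* Illustrative only: give the DSP worker thread a fixed real-time
 * priority so the synthesis loop isn't starved by normal tasks.
 * The value 70 is an arbitrary example. */
static int make_audio_thread_rt(pthread_t thread)
{
	struct sched_param sp = { .sched_priority = 70 };

	/* SCHED_FIFO: runs until it blocks or something higher preempts it */
	return pthread_setschedparam(thread, SCHED_FIFO, &sp);
}

That protects the CPU share, but it doesn't by itself express the
device-driven urgency you're describing above.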

> In practice, you probably don't want to burden desktop Linux with
> priority inheritance where you don't have to. Priority queues with
> algorithmically efficient decrease-key operations (Fibonacci heaps and
> their ilk) are complicated to implement and have correspondingly high
> constant factors. (However, a sufficiently clever heuristic for
> assigning quasi-static task priorities would usually short-circuit the
> priority cascade; if you can keep N small in the
> tasks-with-unpredictable-priority queue, you can probably use a
> simpler flavor with O(log N) decrease-key. Ask someone who knows more
> about data structures than I do.)

These are app issues and not really something that's mutable in the
kernel per se with regard to the -rt patch.
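
That said, the quoted point about a "simpler flavor with O(log N)
decrease-key" doesn't need anything exotic at app level; a plain binary
min-heap with a position back-pointer already gives it. Purely
illustrative sketch, not from any existing code:

/* Binary min-heap keyed on "urgency", with O(log N) decrease-key.
 * Each item carries its heap index so it can be found in O(1). */
struct pq_item {
	int key;		/* lower key == more urgent */
	int pos;		/* current index in the heap array */
};

struct pq {
	struct pq_item **heap;
	int size;
};

static void pq_swap(struct pq *q, int a, int b)
{
	struct pq_item *tmp = q->heap[a];

	q->heap[a] = q->heap[b];
	q->heap[b] = tmp;
	q->heap[a]->pos = a;
	q->heap[b]->pos = b;
}

/* Sift the item at index i up while it is smaller than its parent. */
static void pq_sift_up(struct pq *q, int i)
{
	while (i > 0 && q->heap[i]->key < q->heap[(i - 1) / 2]->key) {
		pq_swap(q, i, (i - 1) / 2);
		i = (i - 1) / 2;
	}
}

/* Decrease-key: lower the key and restore heap order in O(log N). */
static void pq_decrease_key(struct pq *q, struct pq_item *it, int new_key)
{
	if (new_key >= it->key)
		return;
	it->key = new_key;
	pq_sift_up(q, it->pos);
}

(insert and extract-min omitted; extract-min is the usual O(log N)
sift-down.)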

> More importantly, non-real-time application coders aren't very smart
> about grouping data structure accesses on one side or the other of a
> system call that is likely to release a lock and let something else
> run, flushing application data out of cache. (Kernel coders aren't
> always smart about this either; see LKML threads a few weeks ago about
> racy, and cache-stall-prone, f_pos handling in VFS.) So switching
> tasks immediately on lock release is usually the wrong thing to do if
> letting the task run a little longer would allow it to reach a point
> where it has to block anyway.

I have Solaris-style adaptive locks in my tree with my lockstat patch
under -rt. I've also modified my lockstat patch to track readers
correctly now with rwsems and the like, to see where the single-reader
limitation in the rtmutex hurts.
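
Roughly, the adaptive part boils down to this (a sketch only, not the
actual code; the helpers here are made-up names): spin only while the
lock owner is running on another CPU, otherwise block like a normal
mutex.

/* Sketch of a Solaris-style adaptive lock. try_acquire(),
 * owner_is_on_cpu() and sleep_on_mutex() are hypothetical helpers. */
static void adaptive_lock(struct adaptive_mutex *m)
{
	struct task_struct *owner;

	while (!try_acquire(m)) {
		owner = m->owner;

		/* Owner preempted or blocked: spinning can't help, sleep. */
		if (!owner || !owner_is_on_cpu(owner)) {
			sleep_on_mutex(m);
			continue;
		}

		/* Owner is live on another CPU: busy-wait briefly and retry. */
		cpu_relax();
	}
}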

So far I've seen that less than 10 percent of in-kernel contention
events are actually worth spinning on; the rest of the stats imply
that the mutex owner in question is either preempted or blocked on
something else.

I've been trying to get folks to try this on a larger machine than my
2x AMD64 box so that there is more data regarding Linux contention
and overscheduling in -rt.

> Anyway, I already described the urgency-driven strategy to the extent
> that I've thought it out, elsewhere in this thread. I only held this
> draft back because I wanted to double-check my latency measurements.

bill
