Re: Why can't I set the priority of softirq-hrt? (Re: 2.6.17-rt1)

From: Steven Rostedt
Date: Tue Jun 20 2006 - 12:38:34 EST



On Tue, 20 Jun 2006, Esben Nielsen wrote:

> I am sorry. I should have read some more of the code before asking.
>
> The only question I have is why the priority of the callback is set to
> priority of the task calling hrtimer_start() (current->normal_prio). That
> seems like an odd binding to me. Shouldn't the finding of the priority be moved over to the
> posix-timer code, where it is needed, and be given as a parameter to
> hrtimer_start()?
> In rtmutex.c, where a hrtimer is used as a timeout on a mutex, wouldn't it
> make more sense to use current->prio than current->normal_prio if the task
> is boosted when it starts to wait on a mutex.

That seems reasonable. It probably is a bug to use normal_prio, since we
really do care what prio is at that time.

>
>
> But I am not sure I like the design at all:
>
> Let say you have a bunch of callback running at priority 1 and then the
> next hrt timer with priority 99 expires. Then the callback which
> is running will be boosted to priority 99. So the overall latency at
> priority 99 will at least the latency of the worst hrtimer callback.

You mean for those that expire at the same time?

I don't think this is a problem, because the run_hrtimer_hres_queue runs
the hightest priorty callback first, then it adjusts its prio to the next
priority callback. See hrtimer_adjust_softirq_prio.

> And worse: What if the callback running is blocked on a mutex? Will the
> owner of the mutex be boosted as well? Not according to the code in
> sched.c. Therefore you get priority inversion to priority 1. That is the
> worst case hrtimer latency is that of priority 1.

I don't see this.

>
> Therefore, a simpler and more robust design would be to give the thread
> priority 99 as a default - just as the posix_cpu_timer thread. Then the
> system designer can move it around with chrt when needed.
> In fact you can say the current design have both the worst cases of having
> it running as priority 99 and at priority 1!

I still don't see this happening.

>
> Another complicated design would be to make a task for each priority.
> Then the interrupt wakes the highest priority one, which handles the first
> callback and awakes the next one etc.

Don't think that is necessary.

-- Steve

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/