Re: INFO: rcu detected stall in do_idle

From: Juri Lelli
Date: Tue Oct 30 2018 - 07:12:31 EST


On 30/10/18 11:45, Peter Zijlstra wrote:

[...]

> Hurm.. right. We knew of this issue back when we did it.
> I suppose now it hurts and we need to figure something out.
>
> By virtue of being a real-time class, we do indeed need to have deadline
> on the wall-clock. But if we then don't account runtime on that same
> clock, but on a potentially slower clock, we get the problem that we can
> run longer than our period/deadline, which is what we're running into
> here I suppose.
>
> And yes, at some point RT workloads need to be aware of the jitter
> injected by things like IRQs and such. But I believe the rationale was
> that for soft real-time workloads this current semantic was 'easier'
> because we get to ignore IRQ overhead for workload estimation etc.

Right. In this case the task is injecting the IRQ load itself, but that
probably doesn't make much difference to how we need to treat it
(supposing we can actually distinguish the two cases).

> What we could maybe do is track runtime in both rq_clock_task() and
> rq_clock() and detect where the rq_clock based one exceeds the period
> and then push out the deadline (and add runtime).
>
> Maybe something along such lines; does that make sense?

Yeah, I think I've got the gist of the idea. I'll play with it.
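
To check that reading, here is a rough userspace toy model of the
suggestion. It is only a sketch, not kernel code and not a patch:
struct dl_model, dl_model_account() and the wall_used field are made-up
names for illustration, and the two deltas merely stand in for time
advancing on rq_clock_task() and rq_clock(). Whether the added runtime
should be capped is left open here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy userspace model; every name below is hypothetical. */
struct dl_model {
	uint64_t dl_runtime;	/* budget granted per period (ns) */
	uint64_t dl_period;	/* period length (ns) */
	int64_t  runtime;	/* remaining budget, charged on the task clock */
	uint64_t wall_used;	/* wall-clock time since the last push (ns) */
	uint64_t deadline;	/* absolute deadline on the wall clock (ns) */
};

/*
 * delta_task: time elapsed on the potentially slower clock that ignores
 *             IRQ time (stand-in for rq_clock_task()).
 * delta_wall: time elapsed on the wall clock (stand-in for rq_clock()).
 *
 * Budget is still charged on the task clock, but wall-clock time is
 * tracked as well. Once the wall-clock time exceeds the period, the
 * deadline is pushed out by one period and runtime is added.
 *
 * Returns true if the deadline was pushed out.
 */
static bool dl_model_account(struct dl_model *dl,
			     uint64_t delta_task, uint64_t delta_wall)
{
	bool pushed = false;

	dl->runtime -= (int64_t)delta_task;
	dl->wall_used += delta_wall;

	while (dl->wall_used > dl->dl_period) {
		dl->wall_used -= dl->dl_period;
		dl->deadline += dl->dl_period;
		dl->runtime += (int64_t)dl->dl_runtime;
		pushed = true;
	}

	return pushed;
}
```

With a 10ns budget and a 100ns period, accounting (2, 60) and then
(1, 50) leaves the budget mostly unused on the task clock, yet the
second call crosses the period on the wall clock, so the deadline moves
from 100 to 200 and the budget is topped up.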

Thanks,

- Juri