Re: INFO: rcu detected stall in do_idle

From: luca abeni
Date: Thu Oct 18 2018 - 06:38:15 EST


Hi Juri,

On Thu, 18 Oct 2018 12:10:08 +0200
Juri Lelli <juri.lelli@xxxxxxxxxx> wrote:
[...]
> > Yes, a HZ related limit sounds like something we'd want. But if
> > we're going to do a minimum sysctl, we should also consider adding
> > a maximum, if you set a massive period/deadline, you can, even with
> > a relatively low u, incur significant delays.
> >
> > And do we want to put the limit on runtime or on period ?
> >
> > That is, something like:
> >
> > TICK_NSEC/2 < period < 10*TICK_NSEC
> >
> > and/or
> >
> > TICK_NSEC/2 < runtime < 10*TICK_NSEC
> >
> > Hmm, for HZ=1000 that ends up with a max period of 10ms, that's far
> > too low, 24Hz needs ~41ms. We can of course also limit the runtime
> > by capping u for users (as we should anyway).
>
> I also thought of TICK_NSEC/2 as a reasonably safe lower limit

I tend to think that a divisor larger than "2" should be used (maybe
10? I mean: even if HZ = 100, it might make sense to allow a runtime
equal to 1ms...)


> that will implicitly limit period as well since
>
> runtime <= deadline <= period

I agree that if we end up with TICK_NSEC/2 for the runtime limit then
explicitly enforcing a minimum period is not needed.



> Not sure about the upper limit, though. Lower limit is something
> related to the inherent granularity of the platform/config, upper
> limit is more to do with highest prio stuff with huge period delaying
> everything else; doesn't seem to be related to HZ?

I agree


Luca