Re: [RFT][PATCH v7 6/8] sched: idle: Select idle state before stopping the tick

From: Rafael J. Wysocki
Date: Wed Mar 28 2018 - 19:11:18 EST


On Wed, Mar 28, 2018 at 10:41 PM, Doug Smythies <dsmythies@xxxxxxxxx> wrote:
> On 2018.03.28 08:15 Thomas Ilsche wrote:
>> On 2018-03-28 12:56, Rafael J. Wysocki wrote:
>>> On Wed, Mar 28, 2018 at 12:37 PM, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
>>>> On Wed, Mar 28, 2018 at 10:38 AM, Thomas Ilsche
>>>> <thomas.ilsche@xxxxxxxxxxxxx> wrote:
>>>>> On 2018-03-28 10:13, Rafael J. Wysocki wrote:
>>>>>>
>>>
>>> [cut]
>>>
>>>>
>>>> So I do
>>>>
>>>> $ for cpu in 0 1 2 3; do taskset -c $cpu sh -c 'while true; do usleep
>>>> 500; done' & done
>>>>
>>>> which is a shell kind of imitation of the above and I cannot see this
>>>> issue at all.
>>>>
>>>> I count the number of times data->next_timer_us in menu_select() is
>>>> greater than TICK_USEC and while this "workload" is running, that
>>>> number is exactly 0.
>>>>
>>>> I'll try with a C program still.
>>>
>>> And with a C program I see data->next_timer_us greater than TICK_USEC
>>> while it is running, so let me dig deeper.
>>>
>>
>> I can confirm that a shell-loop behaves differently like you describe.
>> Even if it's just a shell-loop calling "main{usleep(500);}" binary.
>
> I normally use the C program method.
> The timer there returns with the need_sched() flag set.

I found the problem, but addressing it will not be straightforward,
which is kind of unfortunate.

Namely, get_next_timer_interrupt() doesn't take high resolution timers
into account if they are enabled (which I overlooked), but they
obviously need to be taken into account in
tick_nohz_get_sleep_length(), so calling tick_nohz_next_event() in
there is not sufficient.

Moreover, it needs to know the next highres timer not including the
tick and that's not so easy to get. It is doable, though, AFAICS.