Re: [PATCH] sched/cputime: Ensure correct utime and stime proportion

From: xunlei
Date: Thu Jul 05 2018 - 09:59:01 EST


On 7/5/18 9:42 PM, Peter Zijlstra wrote:
> On Thu, Jul 05, 2018 at 09:21:15PM +0800, Xunlei Pang wrote:
>> On 7/5/18 6:46 PM, Peter Zijlstra wrote:
>>> On Wed, Jun 27, 2018 at 08:22:42PM +0800, Xunlei Pang wrote:
>>>> tick-based whole utime is utime_0, tick-based whole stime
>>>> is stime_0, scheduler time is rtime_0.
>>>
>>>> For a long time, the process runs mainly in userspace with
>>>> run-sleep patterns, and because two different clocks, it
>>>> is possible to have the following condition:
>>>> rtime_0 < utime_0 (as with little stime_0)
>>>
>>> I don't follow... what?
>>>
>>> Why are you, and why do you think it makes sense to, compare rtime_0
>>> against utime_0 ?
>>>
>>> The [us]time_0 are, per your earlier definition, ticks. They're not an
>>> actual measure of time. Do not compare the two, that makes no bloody
>>> sense.
>>>
>>
>> [us]time_0 is task_struct:utime{stime}, I cited directly from
>> cputime_adjust(), both in nanoseconds. I assumed "rtime_0 < utime_0"
>> here to simplify the following proof and help explain the problem we met.
>
> In the !VIRT_CPU_ACCOUNTING case they (task_struct::[us]time) are not
> actual durations. Yes, they happen to be accounted in multiples of
> TICK_NSEC and thereby happen to carry a [ns] unit, but they are not
> durations, they are samples.
>
> (we just happen to store them in a [ns] unit because for
> VIRT_CPU_ACCOUNTING they are in fact durations)
>
> If 'rtime < utime' is not a valid assumption to build a problem on for
> !VIRT_CPU_ACCOUNTING.
>

It is rtime < utime + stime, that is, the imprecise tick-based run time
may be larger than the precise scheduler-based run time
(sum_exec_runtime). This can happen with some frequent run-sleep
patterns.

Because stime is usually very small, it is then also possible to have
rtime < utime.
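
To make the run-sleep case concrete, here is a minimal userspace sketch
(not kernel code). It assumes HZ=250, so TICK_NSEC is 4ms, and a task
that always wakes shortly before the tick, runs 1ms in userspace across
it, then sleeps. The names sim_times and simulate() and all the numbers
are made up for illustration only.

```c
#include <stdint.h>

#define TICK_NSEC 4000000ULL	/* assumption: HZ=250, one tick every 4ms */

/* Hypothetical container for the three values discussed in this thread. */
struct sim_times {
	uint64_t utime;	/* tick-sampled user time, in ns */
	uint64_t stime;	/* tick-sampled system time, in ns */
	uint64_t rtime;	/* precise run time (like sum_exec_runtime), in ns */
};

/*
 * In every tick period the task runs in userspace during
 * [run_start, run_start + run_len) and sleeps otherwise. The tick
 * fires at offset tick_phase inside the period. All offsets are in ns
 * and relative to the start of the period.
 */
static struct sim_times simulate(uint64_t nr_ticks, uint64_t run_start,
				 uint64_t run_len, uint64_t tick_phase)
{
	struct sim_times t = { 0, 0, 0 };
	uint64_t i;

	for (i = 0; i < nr_ticks; i++) {
		/* The scheduler clock accounts exactly what was run. */
		t.rtime += run_len;

		/* The tick charges a whole TICK_NSEC to whoever it hits. */
		if (tick_phase >= run_start &&
		    tick_phase < run_start + run_len)
			t.utime += TICK_NSEC;
	}

	return t;
}
```

With simulate(1000, 1500000, 1000000, 2000000) every tick lands on the
task, so utime is charged 1000 * 4ms = 4s while rtime is only
1000 * 1ms = 1s, with stime at 0. Moving the tick outside the run
window, e.g. tick_phase = 3000000, gives the opposite extreme: utime
stays 0 while rtime is still 1s.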

>
> So please try again, so far you're not making any sense.
>

I also have a reproducer that verifies this patch; I can attach it tomorrow.