Ping...
Any good suggestions?
Thanks, all.
On 2022/7/27 12:02, Lihua (lihua, ran) wrote:
Hi all,
I found a problem where the CPU time reported in /proc/stat goes backward: the user time read first is 319, and the value read next is 318. As follows:
first:
cat /proc/stat | grep cpu1
cpu1 319 0 496 41665 0 0 0 0 0 0
then:
cat /proc/stat | grep cpu1
cpu1 318 0 497 41674 0 0 0 0 0 0
The time goes backward, which is counterintuitive.
After debugging this, I found that the problem is caused by the implementation of kcpustat_cpu_fetch_vtime(). As follows:
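To catch this kind of regression automatically, two samples of the same cpuN line can be compared field by field. The following is only a minimal sketch: the helper names parse_cpu_line() and first_regressed_field() are hypothetical, and it assumes the usual "label followed by up to ten counters" layout shown above.

```c
#include <assert.h>
#include <stdio.h>

#define NFIELDS 10

/* Parse one "cpuN ..." line as printed by /proc/stat into out[].
 * Returns the number of counters read (at most NFIELDS). */
int parse_cpu_line(const char *line, unsigned long long out[NFIELDS])
{
    const char *p = line;
    int n = 0;

    assert(line != NULL);

    /* Skip the "cpuN" label. */
    while (*p && *p != ' ' && *p != '\t')
        p++;

    while (n < NFIELDS) {
        int used = 0;

        /* %llu skips leading blanks; %n reports how far we got. */
        if (sscanf(p, "%llu%n", &out[n], &used) != 1)
            break;
        p += used;
        n++;
    }
    return n;
}

/* Return the index of the first counter that decreased between the
 * two samples (0 = user, 1 = nice, 2 = system, ...), or -1 if none. */
int first_regressed_field(const char *before, const char *after)
{
    unsigned long long a[NFIELDS], b[NFIELDS];
    int na = parse_cpu_line(before, a);
    int nb = parse_cpu_line(after, b);
    int n = na < nb ? na : nb;

    for (int i = 0; i < n; i++)
        if (b[i] < a[i])
            return i;
    return -1;
}
```

Feeding it the two cpu1 lines above reports index 0, i.e. the user field is the one that went backward.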
CPU0                                                                          CPU1
First:
show_stat():
  ->kcpustat_cpu_fetch()
    ->kcpustat_cpu_fetch_vtime()
      ->cpustat[CPUTIME_USER] = kcpustat_cpu(cpu) + vtime->utime + delta;     rq->curr is in user mode
      ---> CPU1's rq->curr is running in user space, so utime and delta are added
      ---> rq->curr->vtime->utime is less than 1 tick
Then:
show_stat():
  ->kcpustat_cpu_fetch()
    ->kcpustat_cpu_fetch_vtime()
      ->cpustat[CPUTIME_USER] = kcpustat_cpu(cpu);                            rq->curr is in kernel mode
      ---> CPU1's rq->curr is running in kernel space, so only kcpustat is returned
Because the values of utime, stime and delta are only temporarily added to cpustat, there are two problems when reading /proc/stat:
1. The reported time may go backward;
2. When there are many tasks, the statistics are not accurate enough as long as utime and stime do not exceed one tick.
Time going backward is counterintuitive, and I would like to discuss whether there is a good solution that does not compromise performance.
Thanks a lot.