[kernel-sched-cputime] question about probable bug in cputime_adjust()
From: Gustavo A. R. Silva
Date: Tue Jun 27 2017 - 19:04:02 EST
Hello everybody,
While looking into Coverity ID 1371643 I ran into the following piece
of code at kernel/sched/cputime.c:568:
568/*
569 * Adjust tick based cputime random precision against scheduler runtime
570 * accounting.
571 *
572 * Tick based cputime accounting depend on random scheduling
timeslices of a
573 * task to be interrupted or not by the timer. Depending on these
574 * circumstances, the number of these interrupts may be over or
575 * under-optimistic, matching the real user and system cputime with
a variable
576 * precision.
577 *
578 * Fix this by scaling these tick based values against the total runtime
579 * accounted by the CFS scheduler.
580 *
581 * This code provides the following guarantees:
582 *
583 * stime + utime == rtime
584 * stime_i+1 >= stime_i, utime_i+1 >= utime_i
585 *
586 * Assuming that rtime_i+1 >= rtime_i.
587 */
588static void cputime_adjust(struct task_cputime *curr,
589 struct prev_cputime *prev,
590 u64 *ut, u64 *st)
591{
592 u64 rtime, stime, utime;
593 unsigned long flags;
594
595 /* Serialize concurrent callers such that we can honour our
guarantees */
596 raw_spin_lock_irqsave(&prev->lock, flags);
597 rtime = curr->sum_exec_runtime;
598
599 /*
600 * This is possible under two circumstances:
601 * - rtime isn't monotonic after all (a bug);
602 * - we got reordered by the lock.
603 *
604 * In both cases this acts as a filter such that the rest
of the code
605 * can assume it is monotonic regardless of anything else.
606 */
607 if (prev->stime + prev->utime >= rtime)
608 goto out;
609
610 stime = curr->stime;
611 utime = curr->utime;
612
613 /*
614 * If either stime or both stime and utime are 0, assume
all runtime is
615 * userspace. Once a task gets some ticks, the monotonicy code at
616 * 'update' will ensure things converge to the observed ratio.
617 */
618 if (stime == 0) {
619 utime = rtime;
620 goto update;
621 }
622
623 if (utime == 0) {
624 stime = rtime;
625 goto update;
626 }
627
628 stime = scale_stime(stime, rtime, stime + utime);
629
630update:
631 /*
632 * Make sure stime doesn't go backwards; this preserves
monotonicity
633 * for utime because rtime is monotonic.
634 *
635 * utime_i+1 = rtime_i+1 - stime_i
636 * = rtime_i+1 - (rtime_i - utime_i)
637 * = (rtime_i+1 - rtime_i) + utime_i
638 * >= utime_i
639 */
640 if (stime < prev->stime)
641 stime = prev->stime;
642 utime = rtime - stime;
643
644 /*
645 * Make sure utime doesn't go backwards; this still preserves
646 * monotonicity for stime, analogous argument to above.
647 */
648 if (utime < prev->utime) {
649 utime = prev->utime;
650 stime = rtime - utime;
651 }
652
653 prev->stime = stime;
654 prev->utime = utime;
655out:
656 *ut = prev->utime;
657 *st = prev->stime;
658 raw_spin_unlock_irqrestore(&prev->lock, flags);
659}
The issue here is that the value assigned to variable utime at line
619 is overwritten at line 642, which would make such variable
assignment useless.
But I'm suspicious that such assignment is actually correct and that
line 642 should be included into the IF block at line 640. Something
similar to the following patch:
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -637,9 +637,10 @@ static void cputime_adjust(struct task_cputime *curr,
* = (rtime_i+1 - rtime_i) + utime_i
* >= utime_i
*/
- if (stime < prev->stime)
+ if (stime < prev->stime) {
stime = prev->stime;
- utime = rtime - stime;
+ utime = rtime - stime;
+ }
If you confirm this, I will send a patch in a full and proper form.
I'd really appreciate your comments.
Thank you!
--
Gustavo A. R. Silva