Re: sched: odd values for effective load calculations

From: Yuyang Du
Date: Fri Dec 19 2014 - 03:28:20 EST


On Tue, Dec 16, 2014 at 10:09:48AM +0800, Yuyang Du wrote:
>
> Sasha, it might be helpful to see this_load is from:
>
> this_load1: this_load = target_load(this_cpu, idx);
>
> or
>
> this_load2: this_load += effective_load(tg, this_cpu, -weight, -weight);
>
> It really does not seem to be this_load1, while the calc of effective_load is a bit
> complicated to see what the problem is.

Hi all,

I finally managed to reproduce this, though not with trinity; simply rebooting
repeatedly was enough.

Indeed, the problem is from:

this_load2: this_load += effective_load(tg, this_cpu, -weight, -weight);

After digging into effective_load(), the root cause is:

wl = (w * tg->shares) / W;

If w is negative, it is converted to unsigned long for the multiplication
(tg->shares is unsigned long), so the product wraps around to a huge value and
wl ends up as an insane number.
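
To make the conversion concrete, here is a minimal userspace sketch of the
original expression. The function name wl_mixed and the sample values are made
up for illustration; only the expression itself comes from effective_load().

```c
/*
 * Same expression as in effective_load(), with the same operand types.
 * Because shares is unsigned long, w is converted to unsigned long for
 * the multiply, and W is converted to unsigned long for the divide, so
 * the whole computation is done in unsigned arithmetic.
 */
static long wl_mixed(long w, unsigned long shares, long W)
{
	return (w * shares) / W;
}
```

For example, wl_mixed(-1024, 1024, 2048) comes out as a large positive number
rather than the intended -512, while a non-negative w such as
wl_mixed(1024, 1024, 2048) still gives 512.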

I tried this in userspace. Interestingly, if we have:

wl = w * tg->shares;
wl /= W;

the result is OK, but it is not OK when the two lines are combined as in the
original code.
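
The difference is most likely down to where the value is converted back to
long. In the split form, the unsigned product is assigned to wl (a long)
before the division, so the division itself is signed. The sketch below shows
that form next to the form with the cast. It assumes the usual two's
complement wrap-around that gcc and clang apply when an out-of-range unsigned
long is converted to long (this conversion is implementation-defined in C);
wl_split and wl_cast are illustrative names, not kernel functions.

```c
/* Split form: multiply in unsigned long, convert to long, divide signed. */
static long wl_split(long w, unsigned long shares, long W)
{
	long wl = w * shares;	/* unsigned product, converted on assignment */

	wl /= W;		/* both operands are long: signed division */
	return wl;
}

/* Cast form: every operand is long, so no unsigned arithmetic at all. */
static long wl_cast(long w, unsigned long shares, long W)
{
	return (w * (long)shares) / W;
}
```

With w = -1024, shares = 1024 and W = 2048, both functions return -512 under
the assumption above.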

Anyway, the following patch can fix this.

---
Subject: [PATCH] sched: Fix long and unsigned long multiplication error in
effective_load

In effective_load(), we compute (long w * unsigned long tg->shares) / long W.
When w is negative, it is converted to unsigned long, and hence the product
is insanely large. Fix this by casting tg->shares to long.

Reported-by: Sasha Levin <sasha.levin@xxxxxxxxxx>
Signed-off-by: Yuyang Du <yuyang.du@xxxxxxxxx>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index df2cdf7..6b99659 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4424,7 +4424,7 @@ static long effective_load(struct task_group *tg, int cpu, long wl, long wg)
 	 * wl = S * s'_i; see (2)
 	 */
 	if (W > 0 && w < W)
-		wl = (w * tg->shares) / W;
+		wl = (w * (long)tg->shares) / W;
 	else
 		wl = tg->shares;

--