Re: sched: odd values for effective load calculations

From: Sasha Levin
Date: Mon Dec 15 2014 - 23:52:30 EST

On 12/15/2014 07:12 AM, Peter Zijlstra wrote:
> Sorry for the long delay, I was out for a few weeks due to having become
> a dad for the second time.

Congrats! May you be able to sleep at night sooner rather than later.

> On Sat, Dec 13, 2014 at 09:30:12AM +0100, Ingo Molnar wrote:
>> * Sasha Levin <levinsasha928@xxxxxxxxx> wrote:
>>> Hi all,
>>> I was fuzzing with trinity inside a KVM tools guest, running the latest -next
>>> kernel along with the undefined behaviour sanitizer patch, and hit the following:
>>> [ 787.894288] ================================================================================
>>> [ 787.897074] UBSan: Undefined behaviour in kernel/sched/fair.c:4541:17
>>> [ 787.898981] signed integer overflow:
>>> [ 787.900066] 361516561629678 * 101500 cannot be represented in type 'long long int'
> So that's:
>     this_eff_load *= this_load +
>             effective_load(tg, this_cpu, weight, weight);
> Going by the numbers the 101500 must be 'this_eff_load', 100 * ~1024
> makes that. Which makes the rhs 'large'. Do you have
> CONFIG_FAIR_GROUP_SCHED enabled? If so, what kind of cgroup hierarchy
> are you using?

CONFIG_FAIR_GROUP_SCHED is enabled. There's no cgroup set-up initially,
but I figure that trinity is able to do crazy things here.
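
The arithmetic in the UBSan report quoted above can be checked in userspace. What follows is a minimal sketch, not the kernel code: the helper names mul_overflows() and max_safe_rhs() are hypothetical, and it assumes GCC >= 5 or Clang for __builtin_mul_overflow. The two constants are copied from the report.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Operands copied from the UBSan report; the macro names are illustrative. */
#define REPORTED_RHS      361516561629678LL  /* this_load + effective_load(...) */
#define REPORTED_EFF_LOAD 101500LL           /* this_eff_load before the multiply */

/*
 * Returns true if a * b cannot be represented in a signed long long.
 * Relies on the GCC/Clang builtin, which performs the check without
 * invoking undefined behaviour.
 */
bool mul_overflows(long long a, long long b)
{
	long long unused;

	return __builtin_mul_overflow(a, b, &unused);
}

/*
 * Largest non-negative right-hand side that can still be multiplied
 * by eff_load without overflowing a signed long long.
 */
long long max_safe_rhs(long long eff_load)
{
	assert(eff_load > 0);
	return LLONG_MAX / eff_load;
}
```

With these inputs, mul_overflows(REPORTED_RHS, REPORTED_EFF_LOAD) is true, and the reported right-hand side is roughly four times larger than max_safe_rhs(REPORTED_EFF_LOAD), so the load term itself has to be implausibly large for the multiply to trip.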

> In any case, bit sad this doesn't have a register dump included :/
> Is this easy to reproduce or something that happened once?

It's fairly reproducible; I've seen it happen quite a few times. What other
information might be useful?

>>> The values for effective load seem a bit off (and are overflowing!).
>> It definitely looks like a bug in SMP load balancing!
> Yeah, although theoretically (and somewhat practical) this can be
> triggered in more places if you manage to run up the 'weight' with
> enough tasks.
> That said, it should at worst result in 'funny' balancing behaviour, not
> anything else.

I'm not sure if you've caught up on the RCU stall issue we've been trying
to track down, but could this "funny" balancing behaviour be "funny" enough
to cause a stall?

