Re: NMI watchdog triggering during load_balance

From: David Ahern
Date: Fri Mar 06 2015 - 10:11:38 EST


On 3/6/15 2:07 AM, Peter Zijlstra wrote:
> On Thu, Mar 05, 2015 at 09:05:28PM -0700, David Ahern wrote:
>> Since each domain is a superset of the lower one, each pass through
>> load_balance regularly repeats the processing of the previous domain (e.g.,
>> NODE domain repeats the cpus in the CPU domain). Multiply that
>> across 1024 cpus and it seems like a lot of duplication.
>
> It is, _but_ each domain has an interval, bigger domains _should_ load
> balance at a bigger interval (iow lower frequency), and all this is
> lockless data gathering, so reusing stuff from the previous round could
> be quite stale indeed.
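
A rough sketch of the duplication being discussed in the quoted text. The domain spans below are hypothetical (the thread gives only the 1024-cpu count, not the topology), and the model assumes each cpu scans the full span of every domain level once per pass, ignoring the differing intervals and any short-circuiting the kernel does:

```python
# Hypothetical domain spans (cpus covered per level) on a 1024-cpu box.
# These sizes are assumptions for illustration, not taken from the thread.
DOMAIN_SPANS = {"SMT": 8, "MC": 64, "CPU": 256, "NODE": 1024}
NR_CPUS = 1024


def cpus_scanned_per_pass(spans):
    """Cpus examined when one cpu walks every domain level once."""
    return sum(spans.values())


def duplicated_scans(spans):
    """Cpus examined more than once in a single walk.

    Each level is a superset of the one below it, so every level except
    the largest is covered again by its parent.
    """
    sizes = sorted(spans.values())
    return sum(sizes[:-1])


per_cpu = cpus_scanned_per_pass(DOMAIN_SPANS)   # 8 + 64 + 256 + 1024
repeated = duplicated_scans(DOMAIN_SPANS)       # 8 + 64 + 256
machine_wide = per_cpu * NR_CPUS                # if every cpu did a full walk

print(f"per-cpu walk: {per_cpu} cpus scanned, {repeated} of them repeats")
print(f"machine-wide worst case: {machine_wide} scans")
```

Under these assumed spans the repeats are about a quarter of each walk. The larger cost in this simplified model is that the full walk is multiplied by the number of cpus doing it.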


Yes and I have twiddled the intervals. The defaults for min_interval and max_interval (msec):
domain  min_interval  max_interval
SMT     1             2
MC      1             4
CPU     1             4
NODE    8             32

Increasing those values (e.g. moving NODE to 50 and 100) reduces idle-time cpu usage but does not solve the fundamental problem -- under load the balancing of the domains seems to line up, and the system comes to a halt in a load-balancing frenzy.
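
A minimal sketch of why larger intervals alone may not prevent the domains from lining up. It assumes fixed intervals equal to min_interval and a common start time, which is a simplification since the real interval moves between min_interval and max_interval. The helper names are hypothetical:

```python
from functools import reduce
from math import gcd

# min_interval per domain in msec, from the defaults listed above.
DEFAULT_MIN_MS = {"SMT": 1, "MC": 1, "CPU": 1, "NODE": 8}
# Same, with NODE raised to 50 as in the experiment described above.
TUNED_MIN_MS = {"SMT": 1, "MC": 1, "CPU": 1, "NODE": 50}


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def alignment_period(intervals):
    """Period (msec) at which every domain is due at the same time."""
    return reduce(lcm, intervals.values())


def domains_due(intervals, t_ms):
    """Domains due for balancing at time t_ms under the fixed-interval model."""
    return [name for name, iv in intervals.items() if t_ms % iv == 0]


print(alignment_period(DEFAULT_MIN_MS))   # all four levels coincide this often
print(alignment_period(TUNED_MIN_MS))
print(domains_due(DEFAULT_MIN_MS, 8))
```

In this model, raising the NODE interval makes full alignment less frequent but does not remove it. When it does occur, every level is still processed in the same pass.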

David