[PATCH] sched/fair: add comments for group_type and balancing at SD_NUMA level

From: Vincent Guittot
Date: Tue Nov 12 2019 - 09:51:00 EST


Add comments to describe each state of goup_type and to add some details
about the load balance at NUMA level.

Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
---
kernel/sched/fair.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c93d534..268e441 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6986,11 +6986,34 @@ enum fbq_type { regular, remote, all };
* group. see update_sd_pick_busiest().
*/
enum group_type {
+ /*
+ * The group has spare capacity that can be used to process more work.
+ */
group_has_spare = 0,
+ /*
+ * The group is fully used and the tasks don't compete for more CPU
+ * cycles. Nevetheless, some tasks might wait before running.
+ */
group_fully_busy,
+ /*
+ * One task doesn't fit with CPU's capacity and must be migrated on a
+ * more powerful CPU.
+ */
group_misfit_task,
+ /*
+ * One local CPU with higher capacity is available and task should be
+ * migrated on it instead on current CPU.
+ */
group_asym_packing,
+ /*
+ * The tasks affinity prevents the scheduler to balance the load across
+ * the system.
+ */
group_imbalanced,
+ /*
+ * The CPU is overloaded and can't provide expected CPU cycles to all
+ * tasks.
+ */
group_overloaded
};

@@ -8608,7 +8631,11 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s

/*
* Try to use spare capacity of local group without overloading it or
- * emptying busiest
+ * emptying busiest.
+ * XXX Spreading tasks across numa nodes is not always the best policy
+ * and special cares should be taken for SD_NUMA domain level before
+ * spreading the tasks. For now, load_balance() fully relies on
+ * NUMA_BALANCING and fbq_classify_group/rq to overide the decision.
*/
if (local->group_type == group_has_spare) {
if (busiest->group_type > group_fully_busy) {
--
2.7.4