[PATCH 05/19] sched/numa: Use task faults only if numa_group is not yet setup
From: Srikar Dronamraju
Date: Mon Jun 04 2018 - 06:04:42 EST
When numa_group faults are available, task_numa_placement only uses
numa_group faults to evaluate preferred node. However it still accounts
task faults and even evaluates the preferred node just based on task
faults just to discard it in favour of preferred node chosen on the
basis of numa_group.
Instead use task faults only if numa_group is not set.
Testcase Time: Min Max Avg StdDev
numa01.sh Real: 506.35 794.46 599.06 104.26
numa01.sh Sys: 150.37 223.56 195.99 24.94
numa01.sh User: 43450.69 61752.04 49281.50 6635.33
numa02.sh Real: 60.33 62.40 61.31 0.90
numa02.sh Sys: 18.12 31.66 24.28 5.89
numa02.sh User: 5203.91 5325.32 5260.29 49.98
numa03.sh Real: 696.47 853.62 745.80 57.28
numa03.sh Sys: 85.68 123.71 97.89 13.48
numa03.sh User: 55978.45 66418.63 59254.94 3737.97
numa04.sh Real: 444.05 514.83 497.06 26.85
numa04.sh Sys: 230.39 375.79 316.23 48.58
numa04.sh User: 35403.12 41004.10 39720.80 2163.08
numa05.sh Real: 423.09 460.41 439.57 13.92
numa05.sh Sys: 287.38 480.15 369.37 68.52
numa05.sh User: 34732.12 38016.80 36255.85 1070.51
Testcase Time: Min Max Avg StdDev %Change
numa01.sh Real: 478.45 565.90 515.11 30.87 16.29%
numa01.sh Sys: 207.79 271.04 232.94 21.33 -15.8%
numa01.sh User: 39763.93 47303.12 43210.73 2644.86 14.04%
numa02.sh Real: 60.00 61.46 60.78 0.49 0.871%
numa02.sh Sys: 15.71 25.31 20.69 3.42 17.35%
numa02.sh User: 5175.92 5265.86 5235.97 32.82 0.464%
numa03.sh Real: 776.42 834.85 806.01 23.22 -7.47%
numa03.sh Sys: 114.43 128.75 121.65 5.49 -19.5%
numa03.sh User: 60773.93 64855.25 62616.91 1576.39 -5.36%
numa04.sh Real: 456.93 511.95 482.91 20.88 2.930%
numa04.sh Sys: 178.09 460.89 356.86 94.58 -11.3%
numa04.sh User: 36312.09 42553.24 39623.21 2247.96 0.246%
numa05.sh Real: 393.98 493.48 436.61 35.59 0.677%
numa05.sh Sys: 164.49 329.15 265.87 61.78 38.92%
numa05.sh User: 33182.65 36654.53 35074.51 1187.71 3.368%
Ideally this change shouldn't have affected performance.
Signed-off-by: Srikar Dronamraju <srikar@xxxxxxxxxxxxxxxxxx>
---
kernel/sched/fair.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 94091e6..e7c07aa 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2072,8 +2072,8 @@ static int preferred_group_nid(struct task_struct *p, int nid)
static void task_numa_placement(struct task_struct *p)
{
- int seq, nid, max_nid = -1, max_group_nid = -1;
- unsigned long max_faults = 0, max_group_faults = 0;
+ int seq, nid, max_nid = -1;
+ unsigned long max_faults = 0;
unsigned long fault_types[2] = { 0, 0 };
unsigned long total_faults;
u64 runtime, period;
@@ -2152,15 +2152,15 @@ static void task_numa_placement(struct task_struct *p)
}
}
- if (faults > max_faults) {
- max_faults = faults;
+ if (!p->numa_group) {
+ if (faults > max_faults) {
+ max_faults = faults;
+ max_nid = nid;
+ }
+ } else if (group_faults > max_faults) {
+ max_faults = group_faults;
max_nid = nid;
}
-
- if (group_faults > max_group_faults) {
- max_group_faults = group_faults;
- max_group_nid = nid;
- }
}
update_task_scan_period(p, fault_types[0], fault_types[1]);
@@ -2168,7 +2168,7 @@ static void task_numa_placement(struct task_struct *p)
if (p->numa_group) {
numa_group_count_active_nodes(p->numa_group);
spin_unlock_irq(group_lock);
- max_nid = preferred_group_nid(p, max_group_nid);
+ max_nid = preferred_group_nid(p, max_nid);
}
if (max_faults) {
--
1.8.3.1