Re: [PATCH 2.5.44] Simple NUMA Scheduler rev4 (2/2)

From: Michael Hohnbaum (hohnbaum@us.ibm.com)
Date: Wed Oct 23 2002 - 15:22:03 EST


On Wed, 2002-10-23 at 10:53, Michael Hohnbaum wrote:
> Attached is my simple NUMA scheduler patch. It has been split into
> two separate patches,

Here is the second piece of the NUMA scheduler patch. It adds logic to
the exec path to place the exec'd task on the least loaded runqueue.

This only affects kernels built with CONFIG_NUMA.

Please consider for inclusion in 2.5 kernel trees.

-- 

Michael Hohnbaum 503-578-5486 hohnbaum@us.ibm.com T/L 775-5486

--- clean-2.5.44/kernel/sched.c	Wed Oct 23 10:03:50 2002
+++ linux-2.5.44/kernel/sched.c	Wed Oct 23 10:03:59 2002
@@ -33,6 +33,7 @@
 #include <linux/timer.h>
 #include <linux/rcupdate.h>
 #include <asm/topology.h>
+#include <linux/percpu.h>
 
 /*
  * Convert user-nice values [ -20 ... 0 ... 19 ]
@@ -2134,6 +2135,81 @@ __init int migration_init(void)
 
 #endif
 
+#if CONFIG_NUMA
+/*
+ * If dest_cpu is allowed for this process, migrate the task to it.
+ * This is accomplished by forcing the cpu_allowed mask to only
+ * allow dest_cpu, which will force the cpu onto dest_cpu.  Then
+ * the cpu_allowed mask is restored.
+ */
+static void sched_migrate_task(task_t *p, int dest_cpu)
+{
+	unsigned long old_mask;
+
+	old_mask = p->cpus_allowed;
+	if (!(old_mask & (1UL << dest_cpu)))
+		return;
+	/* force the process onto the specified CPU */
+	set_cpus_allowed(p, 1UL << dest_cpu);
+
+	/* restore the cpus allowed mask */
+	set_cpus_allowed(p, old_mask);
+}
+
+/*
+ * keep track of the last cpu that we exec'd from - use of this
+ * can be "fuzzy" as multiple procs can grab this at more or less
+ * the same time and set it similarly.  Those situations will
+ * balance out on a heavily loaded system (where they are more
+ * likely to occur) quite rapidly
+ */
+static DEFINE_PER_CPU(int, last_exec_cpu) = 0;
+/*
+ * Find the least loaded CPU.  Slightly favor the current CPU by
+ * setting its runqueue length as the minimum to start.  If
+ * current is lightly loaded, just stick with it.
+ */
+static int sched_best_cpu(struct task_struct *p)
+{
+	int i, minload, best_cpu, cur_cpu, node;
+	best_cpu = task_cpu(p);
+	if (cpu_rq(best_cpu)->nr_running <= 2)
+		return best_cpu;
+
+	node = __cpu_to_node(__get_cpu_var(last_exec_cpu));
+	if (++node >= numnodes)
+		node = 0;
+
+	cur_cpu = __node_to_first_cpu(node);
+	minload = cpu_rq(best_cpu)->nr_running;
+
+	for (i = 0; i < NR_CPUS; i++) {
+		if (!cpu_online(cur_cpu))
+			continue;
+
+		if (minload > cpu_rq(cur_cpu)->nr_running) {
+			minload = cpu_rq(cur_cpu)->nr_running;
+			best_cpu = cur_cpu;
+		}
+		if (++cur_cpu >= NR_CPUS)
+			cur_cpu = 0;
+	}
+	__get_cpu_var(last_exec_cpu) = best_cpu;
+	return best_cpu;
+}
+
+void sched_balance_exec(void)
+{
+	int new_cpu;
+
+	if (numnodes > 1) {
+		new_cpu = sched_best_cpu(current);
+		if (new_cpu != smp_processor_id())
+			sched_migrate_task(current, new_cpu);
+	}
+}
+#endif /* CONFIG_NUMA */
+
 #if CONFIG_SMP || CONFIG_PREEMPT
 /*
  * The 'big kernel lock'
--- clean-2.5.44/fs/exec.c	Tue Oct 22 13:53:08 2002
+++ linux-2.5.44/fs/exec.c	Tue Oct 22 10:32:19 2002
@@ -1001,6 +1001,8 @@ int do_execve(char * filename, char ** a
 	int retval;
 	int i;
 
+	sched_balance_exec();
+
 	file = open_exec(filename);
 
 	retval = PTR_ERR(file);
--- clean-2.5.44/include/linux/sched.h	Tue Oct 22 13:53:41 2002
+++ linux-2.5.44/include/linux/sched.h	Tue Oct 22 10:32:19 2002
@@ -156,6 +156,11 @@ extern void update_one_process(struct ta
 			       unsigned long system, int cpu);
 extern void scheduler_tick(int user_tick, int system);
 extern unsigned long cache_decay_ticks;
+#ifdef CONFIG_NUMA
+extern void sched_balance_exec(void);
+#else
+#define sched_balance_exec() {}
+#endif
 
 
 #define	MAX_SCHEDULE_TIMEOUT	LONG_MAX
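
For anyone curious how the cpus_allowed trick in sched_migrate_task()
behaves, here is a rough user-space analogue (an illustration only, using
glibc's affinity calls rather than the kernel's set_cpus_allowed(); the
destination CPU is made up):

/*
 * Pin the task to the destination CPU so the scheduler moves it there,
 * then put the original mask back - same two-step idea as the patch.
 * Assumes a glibc with sched_setaffinity()/sched_getcpu().
 */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
	cpu_set_t old_mask, dest_mask;
	int dest_cpu = 1;	/* hypothetical destination CPU */

	if (sched_getaffinity(0, sizeof(old_mask), &old_mask) < 0)
		return 1;
	if (!CPU_ISSET(dest_cpu, &old_mask))
		return 0;	/* not allowed there - bail, as the patch does */

	CPU_ZERO(&dest_mask);
	CPU_SET(dest_cpu, &dest_mask);
	sched_setaffinity(0, sizeof(dest_mask), &dest_mask);	/* forces the move */
	sched_setaffinity(0, sizeof(old_mask), &old_mask);	/* restore the mask */

	printf("now running on cpu %d\n", sched_getcpu());
	return 0;
}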

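And a toy model of the sched_best_cpu() scan above, to show why the
per-CPU last_exec_cpu rotation matters: the comparison is a strict ">",
so when several runqueues are tied at the minimum the first CPU scanned
wins, and rotating the scan's starting point (the patch starts at the
first CPU of the node after last_exec_cpu's node) is, as far as I can
tell, what keeps a burst of execs from all settling on the same CPU.
The 16-CPU layout and load numbers below are invented for the example:

/* Toy user-space model of the sched_best_cpu() scan - not kernel code. */
#include <stdio.h>

#define NCPUS 16

/* start from the task's own runqueue length, walk every CPU beginning at
 * scan_start, and keep the first strictly smaller load that turns up */
static int best_cpu(const int *nr_running, int task_cpu, int scan_start)
{
	int cur = scan_start, best = task_cpu, minload = nr_running[task_cpu];

	for (int i = 0; i < NCPUS; i++) {
		if (minload > nr_running[cur]) {
			minload = nr_running[cur];
			best = cur;
		}
		if (++cur >= NCPUS)
			cur = 0;
	}
	return best;
}

int main(void)
{
	int load[NCPUS];

	for (int i = 0; i < NCPUS; i++)
		load[i] = 1;	/* every other runqueue equally light */
	load[0] = 3;		/* the exec'ing task sits on a busy CPU */

	/* identical loads, different starting point -> different winners */
	printf("scan from cpu 4 -> cpu %d\n", best_cpu(load, 0, 4));	/* picks 4 */
	printf("scan from cpu 8 -> cpu %d\n", best_cpu(load, 0, 8));	/* picks 8 */
	return 0;
}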