45,46c45,46
< * Init task must be ok at boot for the ix86 as we will check its signals
< * via the SMP irq return path.
---
> * INIT_TASKS[0] must be ok at boot for the ix86; we will check its
> * signals via the SMP irq return path.
54,57c54,56
< * The scheduler lock is protecting against multiple entry
< * into the scheduling code, and doesn't need to worry
< * about interrupts (because interrupts cannot call the
< * scheduler).
---
> * The TASKLIST_LOCK protects against multiple entries
> * into the scheduling code. It does not need to worry
> * about interrupts as interrupts cannot call the scheduler.
59,60c58,59
< * The run-queue lock locks the parts that actually access
< * and change the run-queues, and have to be interrupt-safe.
---
> * RUNQUEUE_LOCK protects the code that actually accesses
> * and changes run-queues and needs to be interrupt-safe.
68,69c67,68
< * We align per-CPU scheduling data on cacheline boundaries,
< * to prevent cacheline ping-pong.
---
> * By aligning the per-CPU scheduling data on cacheline
> * boundaries, we prevent cacheline ping-pong.
98,101c97,101
< * This is the function that decides how desirable a process is..
< * You can weigh different processes against each other depending
< * on what CPU they've run on lately etc to try to handle cache
< * and TLB miss penalties.
---
> * goodness determines how desirable it is to run a process
> * on a particular CPU. The following factors are taken into
> * account: the CPU the process last ran on (which lowers the
> * number of cache and TLB miss penalties) and whether the
> * process is marked as a realtime process.
111c111,112
< static inline int goodness(struct task_struct * p, int this_cpu, struct mm_struct *this_mm)
---
> static inline int goodness(struct task_struct * p, int this_cpu,
>        struct mm_struct *this_mm)
116,118c117,119
< * Realtime process, select the first one on the
< * runqueue (taking priorities within processes
< * into account).
---
> * If this is a realtime process then select
> * the first one on the runqueue. Priorities within
> * realtime processes are still taken into account.
129,130c130,131
< * Don't do any other calculations if the time slice is
< * over..
---
> * If the time slice is over, there is no need to
> * do other calculations.
137,138c138,142
< /* Give a largish advantage to the same processor... */
< /* (this is equivalent to penalizing other processors) */
---
> /*
> * Give a largish advantage to the same processor.
> * Note that this is equivalent to penalizing other
> * processors.
> */
143c147,150
< /* .. and a slight advantage to the current MM */
---
> /*
> * Add a slight advantage if the process shares
> * the current MM.
> */
153,159c160,167
< * subtle. We want to discard a yielded process only if it's being
< * considered for a reschedule. Wakeup-time 'queries' of the scheduling
< * state do not count. Another optimization we do: sched_yield()-ed
< * processes are runnable (and thus will be considered for scheduling)
< * right when they are calling schedule(). So the only place we need
< * to care about SCHED_YIELD is when we calculate the previous process'
< * goodness ...
---
> * This is subtle.
> *
> * It is desirable to discard a yielded process only if it's being
> * considered for a reschedule; wakeup-time 'queries' of the scheduling
> * state do not count. Additionally, sched_yield()-ed processes are
> * runnable and thus will be considered for scheduling when they are
> * calling schedule(). Therefore, the only place that needs to care
> * about SCHED_YIELD is here, when calculating the previous process' goodness.
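As a reference point for the rewritten goodness() comments above, the weighting they describe (a realtime boost, the remaining timeslice, a largish same-CPU bonus and a slight same-MM bonus) can be modelled in plain user-space C. The sketch below is only an illustration of those factors; the struct, constants and helper names are invented for the example and are not the kernel's code or values.

    /* Illustrative user-space model only; not the kernel's goodness(). */
    #include <stdio.h>

    #define SAME_CPU_BONUS   15   /* "largish advantage to the same processor" */
    #define SAME_MM_BONUS     1   /* "slight advantage" for sharing the MM     */
    #define RT_BASE        1000   /* realtime always beats SCHED_OTHER         */

    struct fake_task {
        int realtime;     /* nonzero: realtime policy            */
        int rt_priority;  /* priority among realtime processes   */
        int counter;      /* ticks left in the current timeslice */
        int priority;     /* static priority (nice value)        */
        int last_cpu;     /* CPU the task last ran on            */
        int mm_id;        /* stand-in for the task's mm_struct   */
    };

    static int model_goodness(const struct fake_task *p, int this_cpu, int this_mm)
    {
        int weight;

        /* Realtime: first on the runqueue wins, ordered by rt_priority. */
        if (p->realtime)
            return RT_BASE + p->rt_priority;

        /* Time slice over: no other calculations are needed. */
        weight = p->counter;
        if (!weight)
            return 0;

        /* Cache/TLB affinity: prefer the CPU the task last ran on. */
        if (p->last_cpu == this_cpu)
            weight += SAME_CPU_BONUS;

        /* Slight advantage if the task shares the current MM. */
        if (p->mm_id == this_mm)
            weight += SAME_MM_BONUS;

        return weight + p->priority;
    }

    int main(void)
    {
        struct fake_task a = { 0, 0, 5, 20, 0, 1 };   /* was on CPU 0, same MM  */
        struct fake_task b = { 0, 0, 5, 20, 1, 2 };   /* was on CPU 1, other MM */

        printf("on CPU 0: a=%d b=%d\n",
               model_goodness(&a, 0, 1), model_goodness(&b, 0, 1));
        return 0;
    }

The ordering follows from the comments alone: realtime always wins, an exhausted timeslice loses, and ties between otherwise equal tasks are broken by CPU and MM affinity.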
161c169,170
< static inline int prev_goodness(struct task_struct * p, int this_cpu, struct mm_struct *this_mm)
---
> static inline int prev_goodness(struct task_struct * p, int this_cpu,
>        struct mm_struct *this_mm)
163a173
> /* Burn off the SCHED_YIELD flag */
171,172c181,186
< * the 'goodness value' of replacing a process on a given CPU.
< * positive value means 'replace', zero or negative means 'dont'.
---
> * This calculates the 'goodness value' of replacing a process
> * on a given CPU.
> *
> * Return values:
> *   <= 0 - do not replace
> *    > 0 - replace
174c188,189
< static inline int preemption_goodness(struct task_struct * prev, struct task_struct * p, int cpu)
---
> static inline int preemption_goodness(struct task_struct * prev,
>        struct task_struct * p, int cpu)
176c191,192
< return goodness(p, cpu, prev->active_mm) - goodness(prev, cpu, prev->active_mm);
---
> return goodness(p, cpu, prev->active_mm)
>        - goodness(prev, cpu, prev->active_mm);
180,183c196,204
< * This is ugly, but reschedule_idle() is very timing-critical.
< * We enter with the runqueue spinlock held, but we might end
< * up unlocking it early, so the caller must not unlock the
< * runqueue, it's always done by reschedule_idle().
---
> * This is ugly; however, reschedule_idle() is very timing-critical.
> *
> * This function is entered with the runqueue spinlock held. As
> * we might end up unlocking it early, the caller must not unlock
> * the runqueue; unlocking the runqueue lock is always done by
> * this function.
> *
> * Input:
> *   p is a task that was just woken up.
193,194c214,216
< * shortcut if the woken up task's last CPU is
< * idle now.
---
> * Since the task likely established a cache footprint
> * on the last CPU that it was on, put it there if that
> * CPU is idle.
202,204c224,225
< * We know that the preferred CPU has a cache-affine current
< * process, lets try to find a new idle CPU for the woken-up
< * process:
---
> * Since the preferred CPU has a cache-affine current
> * process, try to find an idle CPU for the process.
212,214c233,235
< * We use the last available idle CPU. This creates
< * a priority list between idle CPUs, but this is not
< * a problem.
---
> * The last available idle CPU is used. This creates
> * a priority list between idle CPUs; however, this is
> * not a problem.
221,222c242,243
< * No CPU is idle, but maybe this process has enough priority
< * to preempt it's preferred CPU.
---
> * No CPU is idle; however, this process may have enough
> * priority to preempt the process on its preferred CPU.
230,233c251,256
< * case. No CPU is idle and this process is either lowprio or
< * the preferred CPU is highprio. Try to preempt some other CPU
< * only if it's RT or if it's iteractive and the preferred
< * cpu won't reschedule shortly.
---
> * case. No CPU is idle and this process is either of low
> * priority or the preferred CPU has a high priority
> * process running on it. Attempt to preempt some other CPU
> * only if this is an RT process or if this process is
> * interactive and the preferred cpu will not reschedule
> * within a reasonable amount of time.
235c258,259
< if (p->avg_slice < cacheflush_time || (p->policy & ~SCHED_YIELD) != SCHED_OTHER) {
---
> if (p->avg_slice < cacheflush_time
>        || (p->policy & ~SCHED_YIELD) != SCHED_OTHER) {
254,255c278,279
< * the APIC stuff can go outside of the lock because
< * it uses no task information, only CPU#.
---
> * the APIC stuff can go outside of the lock as
> * it uses no task information, only the CPU number.
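The branch structure that the reschedule_idle() comments above walk through (reuse the woken task's last CPU if it is idle, otherwise the last idle CPU found in a scan, otherwise preempt only where the preemption goodness is positive) can be sketched in isolation. The following is a user-space toy, not the kernel function; the arrays, the helper name and the weights are made up for the example.

    /* Illustrative user-space model only; not the kernel's reschedule_idle(). */
    #include <stdio.h>

    #define NCPU 4

    static int cpu_idle[NCPU]        = { 0, 1, 0, 1 };   /* 1: CPU runs the idle task      */
    static int cpu_curr_weight[NCPU] = { 7, 0, 12, 0 };  /* goodness of what runs there    */

    static int pick_cpu(int last_cpu, int my_weight)
    {
        int cpu, target = -1, best = 0;

        /* Shortcut: the woken task's last CPU is idle, so reuse the
         * cache footprint it left there. */
        if (cpu_idle[last_cpu])
            return last_cpu;

        /* Otherwise take the last available idle CPU; this creates an
         * implicit priority list between idle CPUs, which is harmless. */
        for (cpu = 0; cpu < NCPU; cpu++)
            if (cpu_idle[cpu])
                target = cpu;
        if (target >= 0)
            return target;

        /* No CPU is idle: preempt only where the preemption goodness
         * (our weight minus the running task's weight) is positive. */
        for (cpu = 0; cpu < NCPU; cpu++) {
            int diff = my_weight - cpu_curr_weight[cpu];
            if (diff > best) {
                best = diff;
                target = cpu;
            }
        }
        return target;   /* -1 means: nowhere worth preempting */
    }

    int main(void)
    {
        printf("woken task (last on CPU 2, weight 10) goes to CPU %d\n",
               pick_cpu(2, 10));
        return 0;
    }

In the kernel the preemption step is further gated by the avg_slice/cacheflush_time and realtime checks quoted in the hunks above; the sketch leaves that out.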
260c284,287
< #else /* UP */
---
>
> #else /* __SMP__ */
> /* Uniprocessor system */
>
268c295
< #endif
---
> #endif /* __SMP__ */
275,276c302,303
< * run-queue, not the end. See the comment about "This is
< * subtle" in the scheduler proper..
---
> * run-queue, not the end. See the comment about
> * "This is subtle" in the scheduler proper.
297,300c324,327
< * Wake up a process. Put it on the run-queue if it's not
< * already there. The "current" process is always on the
< * run-queue (except when the actual re-schedule is in
< * progress), and as such you're allowed to do the simpler
---
> * Wake up a process; if it is not already on the run queue
> * then put it there. Note that the "current" process is
> * always on the run-queue except when the actual re-schedule
> * is in progress. Therefore, it is permissible to use the simpler
309c336,337
< * We want the common case fall through straight, thus the goto.
---
> * We want the common case to fall through straight,
> * thus the goto.
316c344,346
< reschedule_idle(p, flags); // spin_unlocks runqueue
---
>
> /* reschedule_idle automatically spin_unlocks the runqueue */
> reschedule_idle(p, flags);
350a381,386
> /*
> * This switch handles the two special cases
> * that timeout may take, MAX_SCHEDULE_TIMEOUT
> * and < 0. This is to make life a bit easier
> * for the caller, nothing more.
> */
355,359c391,394
< * These two special cases are useful to be comfortable
< * in the caller. Nothing more. We could take
< * MAX_SCHEDULE_TIMEOUT from one of the negative value
< * but I' d like to return a valid offset (>=0) to allow
< * the caller to do everything it want with the retval.
---
> * MAX_SCHEDULE_TIMEOUT could have been taken from one of the
> * negative values, but it would be nicer to return a valid offset
> * (>=0) to allow the caller to do anything that it wants
> * with the return value.
365,369c400,404
< * Another bit of PARANOID. Note that the retval will be
< * 0 since no piece of kernel is supposed to do a check
< * for a negative retval of schedule_timeout() (since it
< * should never happens anyway). You just have the printk()
< * that will tell you if something is gone wrong and where.
---
> * More paranoia.
> *
> * The timeout should never be less than zero.
> * Since no caller checks for a negative return value,
> * return 0; the only warning that this happened is the printk().
373c408
< printk(KERN_ERR "schedule_timeout: wrong timeout "
---
> printk(KERN_ERR "schedule_timeout: invalid timeout "
391,392c426,427
< /* RED-PEN. Timer may be running now on another cpu.
< * Pray that process will not exit enough fastly.
---
> /* RED-PEN. Timer could now be running on another cpu.
> * Pray that the process will not exit too quickly.
402c437
< * schedule_tail() is getting called from the fork return path. This
---
> * schedule_tail() gets called from the fork return path. This
414c449
< reschedule_idle(prev, flags); // spin_unlocks runqueue
---
> reschedule_idle(prev, flags); /* spin_unlocks runqueue */
428c463
< * scheduler: it's not perfect, but certainly works for most things.
---
> * scheduler: it's not perfect; however, it certainly works for most things.
430c465
< * The goto is "interesting".
---
> * The gotos are "interesting".
432,433c467,468
< * NOTE!! Task 0 is the 'idle' task, which gets called when no other
< * tasks can run. It can not be killed, and it cannot sleep. The 'state'
---
> * NOTE!! Task 0 is the 'idle' task, which gets the CPU when no other
> * tasks can run. It can neither be killed nor can it sleep. The 'state'
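The two special cases that the schedule_timeout() comments above describe (MAX_SCHEDULE_TIMEOUT meaning "no timer at all", and a negative timeout being a caller bug that is only reported) can be shown with a standalone sketch. This illustrates the described behaviour only; it is not the kernel function, and the names below are placeholders.

    /* Illustrative user-space model only; not the kernel's schedule_timeout(). */
    #include <stdio.h>
    #include <limits.h>

    #define MAX_SCHEDULE_TIMEOUT LONG_MAX

    /* Model of the switch at the top of schedule_timeout(): normalize the
     * caller's timeout before any timer is set up. */
    static long normalize_timeout(long timeout)
    {
        switch (timeout) {
        case MAX_SCHEDULE_TIMEOUT:
            /* Valid special case: sleep with no timer, until woken. */
            return MAX_SCHEDULE_TIMEOUT;
        default:
            if (timeout < 0) {
                /* Caller bug: warn and behave as if already expired,
                 * still returning a valid (>= 0) offset. */
                fprintf(stderr, "invalid timeout value %ld\n", timeout);
                return 0;
            }
            return timeout;   /* ordinary relative timeout in ticks */
        }
    }

    int main(void)
    {
        printf("%ld %ld %ld\n",
               normalize_timeout(MAX_SCHEDULE_TIMEOUT),
               normalize_timeout(-5),
               normalize_timeout(100));
        return 0;
    }

Keeping MAX_SCHEDULE_TIMEOUT positive, as the hunk above explains, is what lets every legitimate return value stay a valid offset (>= 0).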
456c491
< /* Do "administrative" work here while we don't hold any locks */
---
> /* While we do not hold any locks, do some "administrative" work */
462,463c497,498
< * 'sched_data' is protected by the fact that we can run
< * only one process per CPU.
---
> * Since we can only run a single process per CPU, 'sched_data' is
> * implicitly protected.
487c522
< * this is the scheduler proper:
---
> * The scheduler proper:
492c527
< * Default process to select..
---
> * The default process to select.
513,514c548,549
< * from this point on nothing can prevent us from
< * switching to the next task, save this fact in
---
> * From here on out, nothing can prevent a
> * switch to the next task. Save this fact in
521c556
< #endif
---
> #endif /* __SMP__ */
529,533c564,568
< * maintain the per-process 'average timeslice' value.
< * (this has to be recalculated even if we reschedule to
< * the same process) Currently this is only used on SMP,
< * and it's approximate, so we do not have to maintain
< * it while holding the runqueue spinlock.
---
> * Maintain the per-process 'average timeslice' value.
> * This value must be recalculated even if we reschedule to
> * the same process. Currently, this is only used on SMP.
> * Since it is approximate, it does not have to be maintained
> * while the runqueue spinlock is held.
543,545c578,580
< * Exponentially fading average calculation, with
< * some weight so it doesnt get fooled easily by
< * smaller irregularities.
---
> * This is an exponentially fading average calculation
> * with some weight so that it is not easily fooled by
> * small irregularities.
551c586
< * We drop the scheduler lock early (it's a global spinlock),
---
> * Drop the scheduler lock early (NOTE: it is a global spinlock),
557a593
> /* Number of context switches */
558a595
>
560c597
< * there are 3 processes which are affected by a context switch:
---
> * A context switch affects 3 processes:
564c601
< * It's the 'much more previous' 'prev' that is on next's stack,
---
> * They are the 'much more previous' 'prev' that is on next's stack,
588,591c625
< /*
< * This just switches the register state and the
< * stack.
< */
---
> /* This switches the register state and the stack. */
777,779c811,813
< * Setpriority might change our priority at the same moment.
< * We don't have to worry. Conceptually one call occurs first
< * and we have a single winner.
---
> * Setpriority might change the priority at the same moment.
> * There is no need to worry; conceptually, one call occurs
> * first and there is a single winner.
794,796c828,830
< * Unix nice values are -20 to 20; Linux doesn't really
< * use that kind of thing, but uses the length of the
< * timeslice instead (default 200 ms). The rounding is
---
> * Unix nice values are -20 to 20; however, Linux doesn't
> * really use that kind of thing. Instead, it uses the
> * length of the timeslice (default 200 ms). The rounding is
804,809c838,844
< * Current->priority can change between this point
< * and the assignment. We are assigning not doing add/subs
< * so thats ok. Conceptually a process might just instantaneously
< * read the value we stomp over. I don't think that is an issue
< * unless posix makes it one. If so we can loop on changes
< * to current->priority.
---
> * Current->priority can change between this point
> * and the assignment. Since this is only an assignment and
> * not an add/sub, that is acceptable. Conceptually, a
> * process might instantaneously read the value that is
> * stomped over. This should not be an issue unless
> * posix makes it one. If so, it can loop on changes
> * to current->priority.
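The "exponentially fading average" mentioned in the avg_slice comments above is easiest to see with numbers. The toy below is not the kernel's arithmetic; it only shows how an average that mixes the old value with each new sample makes the influence of past timeslices decay geometrically, which is why the value is approximate and safe to compute outside the runqueue spinlock.

    /* Illustrative user-space model only; not the kernel's avg_slice code. */
    #include <stdio.h>

    /* Mix the history and the new sample; older samples decay as
     * 1/2, 1/4, 1/8, ... so one odd timeslice fades quickly. */
    static long fading_avg(long avg, long sample)
    {
        return (avg + sample) / 2;
    }

    int main(void)
    {
        long slices[] = { 10, 10, 10, 100, 10, 10 };
        long avg = 0;
        int i;

        for (i = 0; i < 6; i++) {
            avg = fading_avg(avg, slices[i]);
            printf("slice=%3ld avg=%3ld\n", slices[i], avg);
        }
        return 0;
    }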
820c855
< #endif
---
> #endif /* !__alpha__ */
846,848c881
< /*
< * We play safe to avoid deadlocks.
< */
---
> /* Play it safe to avoid deadlocks. */
1116,1117c1149,1150
< * Put all the gunge required to become a kernel thread without
< * attached user resources in one place where it belongs.
---
> * Put all the gunge required to become a kernel thread without
> * attached user resources in one place, here, where it belongs.
1124d1156
<
1126,1128c1158,1160
< * If we were started as result of loading a module, close all of the
< * user space pages. We don't need them, and if we didn't close them
< * they would be locked into memory.
---
> * If we were started as the result of loading a module, close all
> * of the user space pages; they are unneeded and if they were
> * not closed, they would be locked into memory.
1135c1167
< /* Become as one with the init task */
---
> /* Become one with the init task */
1161,1162c1193,1194
< * We have to do a little magic to get the first
< * process right in SMP mode.
---
> * Do a little magic to get the first process
> * right in SMP mode.