Re: [BUG] mm_struct leak on cpu hotplug (s390/ppc64)

From: Nathan Lynch
Date: Wed Jan 05 2005 - 11:01:52 EST


On Wed, 2005-01-05 at 12:08 +0100, Ingo Molnar wrote:
> the correct solution i think would be to call back into the scheduler
> from cpu_die():
>
> void cpu_die(void)
> {
> if (ppc_md.cpu_die)
> ppc_md.cpu_die();
> + idle_task_exit();
> local_irq_disable();
> for (;;);
> }
>
> and then in idle_task_exit(), do something like:
>
> void idle_task_exit(void)
> {
> struct mm_struct *mm = current->active_mm;
>
> if (mm != &init_mm)
> switch_mm(mm, &init_mm, current);
> mmdrop(mm);
> }
>
> (completely untested.) This makes sure that the idle task uses the
> init_mm (which always has valid pagetables), and also ensures correct
> reference-counting. Hm?

OK, how's this? I'll submit the sched and ppc64 bits separately if
there are no objections. I assume Heiko can take care of s390.

Note that in the ppc64 cpu_die function we must call idle_task_exit
before calling ppc_md.cpu_die, because the latter does not return.

Nathan


Index: 2.6.10/include/linux/sched.h
===================================================================
--- 2.6.10.orig/include/linux/sched.h 2004-12-24 21:33:59.000000000 +0000
+++ 2.6.10/include/linux/sched.h 2005-01-05 14:30:39.000000000 +0000
@@ -735,6 +735,7 @@
#endif

extern void sched_idle_next(void);
+extern void idle_task_exit(void);
extern void set_user_nice(task_t *p, long nice);
extern int task_prio(const task_t *p);
extern int task_nice(const task_t *p);
Index: 2.6.10/kernel/sched.c
===================================================================
--- 2.6.10.orig/kernel/sched.c 2004-12-24 21:35:24.000000000 +0000
+++ 2.6.10/kernel/sched.c 2005-01-05 14:36:40.000000000 +0000
@@ -3995,6 +3995,20 @@
spin_unlock_irqrestore(&rq->lock, flags);
}

+/* Ensures that the idle task is using init_mm right before its cpu goes
+ * offline.
+ */
+void idle_task_exit(void)
+{
+ struct mm_struct *mm = current->active_mm;
+
+ BUG_ON(cpu_online(smp_processor_id()));
+
+ if (mm != &init_mm)
+ switch_mm(mm, &init_mm, current);
+ mmdrop(mm);
+}
+
static void migrate_dead(unsigned int dead_cpu, task_t *tsk)
{
struct runqueue *rq = cpu_rq(dead_cpu);
Index: 2.6.10/arch/ppc64/kernel/setup.c
===================================================================
--- 2.6.10.orig/arch/ppc64/kernel/setup.c 2004-12-24 21:34:44.000000000 +0000
+++ 2.6.10/arch/ppc64/kernel/setup.c 2005-01-05 14:39:43.000000000 +0000
@@ -1337,6 +1337,7 @@

void cpu_die(void)
{
+ idle_task_exit();
if (ppc_md.cpu_die)
ppc_md.cpu_die();
local_irq_disable();


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/