[no patch] broken use of mm_release / deactivate_mm
From: Andries Brouwer
Date: Mon Sep 13 2004 - 14:09:07 EST
Recent kernels have a bug in fork(). Things can be improved a bit
by commenting out the lines
/* Get rid of any cached register state */
deactivate_mm(tsk, mm);
in fork.c:mm_release().
What happens at a fork, is that a long sequence of things is done,
and if a failure occurs all previous things are undone. Thus
(in copy_process()):
if ((retval = copy_mm(clone_flags, p)))
goto bad_fork_cleanup_signal;
if ((retval = copy_namespace(clone_flags, p)))
goto bad_fork_cleanup_mm;
retval = copy_thread(0, clone_flags, stack_start, stack_size, p, regs);
if (retval)
goto bad_fork_cleanup_namespace;
...
bad_fork_cleanup_namespace:
exit_namespace(p);
bad_fork_cleanup_mm:
exit_mm(p);
if (p->active_mm)
mmdrop(p->active_mm);
bad_fork_cleanup_signal:
...
Thus, we may do exit_mm() when an attempted fork fails.
The argument of exit_mm() is this new, not completeley initialized
task_struct.
Now exit.c:exit_mm(p) does mm_release(p,p->mm), and this
mm_release() does deactivate_mm(), a macro that clears %fs and %gs.
Ach. A segfault is the result.
More is wrong with mm_release(). It examines p->clear_child_tid,
and possibly does put_user(0, tidptr);.
Oops.
In our case p->clear_child_tid had not yet been initialized for
the child, that happens in copy_thread() that we never reached.
So this is the value the parent had.
Also the call
enter_lazy_tlb(mm, current);
seems strange in this context.
It seems to me that the proper action is some reshuffling
of this code. Maybe it is cleanest to separate the cleanup
code for a failed fork entirely from the code for an exiting
process.
Andries
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/