Re: uninterruptible CLONE_VFORK (Was: oom: Make coredumpinterruptible)
From: Oleg Nesterov
Date: Mon Jun 14 2010 - 13:35:30 EST
On 06/13, Roland McGrath wrote:
>
> > Oh. And another problem, vfork() is not interruptible too. This means
> > that the user can hide the memory hog from oom-killer.
>
> I'm not sure there is really any danger like that, because of the
> oom_kill_process "Try to kill a child first" logic.
But note that oom_kill_process() doesn't kill the children with the
same ->mm. I never understood this code.
Anyway I agree. Even if I am right, this is not very serious problem
from oom-kill pov. To me, the uninterruptible CLONE_VFORK is bad by
itself.
> > But let's forget about oom.
>
> Sure, but it reminds me to mention that vfork mm sharing is another reason
> that having oom_kill set some persistent state in the mm seems wrong.
Yes, yes, this was already discussed a bit. Only if the core dump is in
progress we can touch ->mm or (probably better but needs a bit more locking)
mm->core_state to signal the coredumping thread and (perhaps) for something
else.
> > Roland, any reason it should be uninterruptible? This doesn't look good
> > in any case. Perhaps the pseudo-patch below makes sense?
>
> I've long thought that we should make a vfork parent SIGKILL-able.
Good ;)
> (Of
> course the vfork wait can't be made interruptible by other signals, since
> it must never do anything userish
Yes sure. That is why wait_for_completion_killable(), not _interrutpible.
But I assume you didn't mean that only SIGKILL should interrupt the
parent, any sig_fatal() signal should.
> I don't know off hand of any problem with your
> straightforward change. But I don't have much confidence that there isn't
> any strange gotcha waiting there due to some other kind of implicit
> assumption about vfork parent blocks that we are overlooking at the moment.
> So I wouldn't change this without more thorough auditing and thinking about
> everything related to vfork.
Agreed. This needs auditing. And CLONE_VFORK can be used with/without all
other CLONE_ flags... Probably we should mostly worry about vfork ==
CLONE_VM | CLONE_VFORK case.
Anyway. ->vfork_done is per-thread. This means that without any changes
do_fork(CLONE_VFORK) can return (to user-mode) before the child's thread
group exits/execs. Perhaps this means we shouldn't worry too much.
> Personally, what I've really been interested in is changing the vfork wait
> to use some different kind of blocking entirely. My real motivation for
> that is to let a vfork wait be morphed into and out of TASK_TRACED,
I see. I never thought about this, but I think you are right.
Hmm. Even without debugger, the parent doesn't react to SIGSTOP. Say,
int main(voif)
{
if (!vfork())
pause();
}
and ^Z won't work obviously. Not good.
This is not trivail I guess. Needs thinking...
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/