Re: [PATCH] skip incrementing nr for TASK_UNINTERRUPTIBLE

From: Oleg Nesterov
Date: Sun Dec 22 2013 - 10:08:40 EST


Vaibhav,

Again, I think Linus has already explained everything, but let me
add some details.

> > In coredump case, where thread_1 faults while thread_2 is in
> > TASK_UNINTERRUPTIBLE state, it cannot handle the SIGKILL.
> > Thus the process hangs on event.
> > The coredump routine freezes until the thread state is
> > uninterruptible.

Yes. But why should we even try to "fix" coredump in this case?

> > Solution: Continue for coredump, without waiting for uninterruptible
> > thread,

This can't work, please see below.

> > as it will get killed as soon as it returns from
> > uninterruptible state.

Not necessarily. It can play with ->mm before it notices the pending
SIGKILL. And, if nothing else, the coredumping paths do not even take
mmap_sem because we assume that the dumper is the only user.

But even if this doesn't happen,

> > Therefore do not increment thread count for threads with
> > TASK_UNINTERRUPTIBLE.

This is very wrong too. This means that we can start the coredump before
the _accounted_ thread exits (because a skipped thread can exit first and
decrement the counter). This also means that coredump_finish() can race
with the unaccounted threads.
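
To make the counter problem concrete, here is a small user-space model,
not kernel code. It assumes two sub-threads besides the dumper: A, which
sleeps in TASK_UNINTERRUPTIBLE, and B, which is runnable. Every exiting
thread decrements the counter whether or not it was counted, and the
dump starts when the counter reaches zero. The function name and the
exit order (A first) are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative model only. Returns how many sub-threads are still alive
 * at the moment the dumper is woken, or -1 if it is never woken.
 */
static int live_threads_at_dump_start(bool skip_uninterruptible)
{
	int alive = 2;	/* threads A and B, both still users of the mm */
	int nr = 0;	/* what the dumper thinks it has to wait for */
	int i;

	/* Counting pass: the patch leaves A out when it is uninterruptible. */
	if (!skip_uninterruptible)
		nr++;		/* thread A */
	nr++;			/* thread B */
	assert(nr > 0);

	/* Exit pass: A returns from its sleep, sees SIGKILL and exits first. */
	for (i = 0; i < 2; i++) {
		alive--;
		if (--nr == 0)
			return alive;	/* dumper is woken here */
	}
	return -1;
}
```

In this model live_threads_at_dump_start(false) gives 0, while
live_threads_at_dump_start(true) gives 1: the dump begins while the
accounted thread B is still running.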

> > 	sigaddset(&t->pending.signal, SIGKILL);
> > 	signal_wake_up(t, 1);
> > -	nr++;
> > +	if(!(t->state & TASK_UNINTERRUPTIBLE))
> > +		nr++;

Again, we can't simply check t->state & TASK_UNINTERRUPTIBLE. The check
can give a false positive, or the thread can go to sleep in
TASK_UNINTERRUPTIBLE right after the check. And even
"& TASK_UNINTERRUPTIBLE" is wrong, please look at TASK_KILLABLE.
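
The TASK_KILLABLE point can be shown with plain bit arithmetic. The
sketch below is user-space C; the numeric values are assumptions
modelled on a 3.x-era include/linux/sched.h and may differ in other
kernel versions, but what matters is that TASK_KILLABLE contains the
TASK_UNINTERRUPTIBLE bit. Both helper functions are hypothetical names
used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, modelled on a 3.x-era include/linux/sched.h. */
#define TASK_INTERRUPTIBLE	1
#define TASK_UNINTERRUPTIBLE	2
#define TASK_WAKEKILL		128
#define TASK_KILLABLE		(TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)

/* The test from the proposed patch: would this thread be left uncounted? */
static bool patch_would_skip(long state)
{
	return (state & TASK_UNINTERRUPTIBLE) != 0;
}

/*
 * Hypothetical helper that only illustrates the bit layout: true for a
 * sleep that a fatal signal cannot interrupt. It does nothing about the
 * race, since the state can still change right after it is read.
 */
static bool sleep_ignores_sigkill(long state)
{
	return (state & TASK_UNINTERRUPTIBLE) && !(state & TASK_WAKEKILL);
}
```

With these definitions patch_would_skip(TASK_KILLABLE) is true although
sleep_ignores_sigkill(TASK_KILLABLE) is false, so the patch would also
skip a thread that is woken by SIGKILL and exits promptly.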

Oleg.
