Re: [PATCH 1/1] do_exit(): Solve possibility of BUG() due to race with try_to_wake_up()

From: Oleg Nesterov
Date: Tue Aug 26 2014 - 11:06:56 EST


On 08/26, Kautuk Consul wrote:
>
> I got one thing wrong:

Yes, your description was not accurate, but

> From some more code review, both __down_common() and
> do_wait_for_common() inspect the signal_pending() only while in
> TASK_RUNNING.

this doesn't really matter, or I missed something.

We have too many problems with this TASK_DEAD state. I have to admit that
I no longer understand why we do not need a barrier after spin_unlock_wait().

	set_current_state(TASK_UNINTERRUPTIBLE);

	__set_current_state(TASK_RUNNING);

	// do_exit()
	mb();
	spin_unlock_wait();
	tsk->state = TASK_DEAD;
	schedule();

Previously I was convinced, but now I think that ttwu(TASK_UNINTERRUPTIBLE)
can still change TASK_DEAD into TASK_RUNNING if the CPU reorders
spin_unlock_wait() and "state = TASK_DEAD".
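
To make the suspected interleaving concrete, here is a rough user-space
sketch that replays it step by step. This is not kernel code and it does not
prove the reordering can really happen; it only shows what the final ->state
would be if the "state = TASK_DEAD" store became visible before
spin_unlock_wait() observed the lock. The model_* names and the state values
are hypothetical, and the waker is assumed to check ->state under the lock
before it writes TASK_RUNNING.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel task states; values are arbitrary. */
enum model_state {
	MODEL_RUNNING = 0,
	MODEL_UNINTERRUPTIBLE = 2,
	MODEL_DEAD = 64,
};

struct model_task {
	int state;
	bool pi_locked;		/* models the lock held by the waker */
};

/*
 * Replay one fixed interleaving of the exiting task and a concurrent
 * ttwu(TASK_UNINTERRUPTIBLE). Returns the state schedule() would see.
 * dead_store_hoisted == true models the suspected CPU reordering.
 */
int model_exit_vs_ttwu(bool dead_store_hoisted)
{
	struct model_task t = {
		.state = MODEL_UNINTERRUPTIBLE,
		.pi_locked = false,
	};
	bool waker_passed_check;

	/* waker: takes the lock and sees the task still sleeping */
	t.pi_locked = true;
	waker_passed_check = (t.state & MODEL_UNINTERRUPTIBLE) != 0;

	/* exiting task: __set_current_state(TASK_RUNNING), then mb() */
	t.state = MODEL_RUNNING;

	/* exiting task: the DEAD store, if it is reordered before the wait */
	if (dead_store_hoisted)
		t.state = MODEL_DEAD;

	/* waker: finishes the wakeup it already committed to, then unlocks */
	if (waker_passed_check)
		t.state = MODEL_RUNNING;
	t.pi_locked = false;

	/* exiting task: spin_unlock_wait() returns only once the lock is free */
	assert(!t.pi_locked);
	if (!dead_store_hoisted)
		t.state = MODEL_DEAD;

	return t.state;
}
```

With the stores in program order, model_exit_vs_ttwu(false) ends in
MODEL_DEAD. With the store hoisted, model_exit_vs_ttwu(true) ends in
MODEL_RUNNING, which is the case where schedule() would return in do_exit().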

Perhaps I am wrong, and in any case we can fix this, but there is another
problem: in theory finish_task_switch() can race with the RUNNING -> DEAD
transition.

So I still think that the (incomplete) patch I sent probably makes sense, even
if it adds the ugly rq->dead check into __schedule().

Let's wait for Peter.

Oleg.
