Re: fs: uninterruptible hang in handle_userfault

From: Linus Torvalds
Date: Wed Mar 02 2016 - 12:03:08 EST


On Wed, Mar 2, 2016 at 6:55 AM, Andrea Arcangeli <aarcange@xxxxxxxxxx> wrote:
>
> Running page faults that late in the exit path with signal disabled
> was frankly unexpected.

I agree that it's less than wonderful.

> Apparently it's not just
> PF_EXITING that prevents SIGKILL to reach handle_userfault(). The
> below change still didn't allow to kill the task:
>
> + exit_futex(tsk); /* run before setting PF_EXITING */
> exit_signals(tsk); /* sets PF_EXITING */

It's not just "exit_futex()" (what is that? I assume you mean
exit_robust_list()) that triggers the problem, it's also the

put_user(0, tsk->clear_child_tid);

in mm_release().

So it's not just about futexes.
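
A minimal user-space sketch of that clear_child_tid write, assuming Linux and glibc's clone() wrapper. The names child_fn, ctid and cleartid_demo are made up for illustration; only the CLONE_CHILD_CLEARTID behaviour is the kernel's:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical names throughout; this is a user-space illustration only. */
static volatile pid_t ctid = -1;	/* word the kernel should zero at child exit */

static int child_fn(void *arg)
{
	(void)arg;
	return 0;	/* child exits; the kernel then runs mm_release() for it */
}

/* Returns 0 if the kernel cleared ctid when the child exited, -1 otherwise. */
int cleartid_demo(void)
{
	const size_t stack_size = 64 * 1024;
	char *stack = malloc(stack_size);
	pid_t pid;

	if (!stack)
		return -1;

	ctid = -1;	/* non-zero sentinel, so a later zero must come from the kernel */

	/*
	 * CLONE_VM shares the mm with the child, so the kernel's write of 0
	 * lands in memory we can read. CLONE_CHILD_CLEARTID points the
	 * child's clear_child_tid at &ctid.
	 */
	pid = clone(child_fn, stack + stack_size,
		    CLONE_VM | CLONE_CHILD_CLEARTID | SIGCHLD,
		    NULL, NULL, NULL, (pid_t *)&ctid);
	if (pid < 0) {
		free(stack);
		return -1;
	}

	waitpid(pid, NULL, 0);	/* the child is fully gone once this returns */
	free(stack);

	return ctid == 0 ? 0 : -1;
}
```

If that word lives in a userfaultfd-registered range, the kernel's write at exit time is a page fault that has to go through handle_userfault(), with no futex involved at all.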

There might be other final user space accesses lurking too that I didn't
even think about.

Anyway, I committed (a) as the safest version with the least side
effects. If people think some more about this and come up with
ways to avoid these kinds of "very late user space accesses"
cleanly, I think that would be great.

Linus