TASK_WAKEKILL && /sbin/init (was: [PATCH 1/2] schedule: fix TASK_WAKEKILL vs SIGKILL race)

From: Oleg Nesterov
Date: Thu Jun 05 2008 - 11:22:15 EST


Sorry Matthew, I left this part unanswered because I didn't have the
time yesterday...

On 06/04, Matthew Wilcox wrote:
>
> On Wed, Jun 04, 2008 at 09:09:05PM +0400, Oleg Nesterov wrote:
> > Note also that with or without this patch TASK_WAKEKILL is not exactly right
> > wrt /sbin/init, but this is another issue.
>
> That's certainly an interesting conversation to have.

If lock_page_killable() fails because the task was killed by SIGKILL or another
fatal signal, do_generic_file_read() returns -EIO.

This seems to be OK because userspace never actually sees this error: the
task will dequeue SIGKILL and exit.

However, /sbin/init is different: it will dequeue SIGKILL, ignore it, and be
confused by this bogus -EIO. Please note that while this bug is unlikely,
it is _not_ theoretical. User-space does sometimes send unhandled fatal
signals to init.
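
To make the failure mode concrete, here is a toy userspace model of the
sequence described above. It is only a sketch and not kernel code: every
toy_* name is hypothetical, and the "unkillable" flag merely stands in for
the way init ignores an unhandled SIGKILL.

```c
/* Toy userspace model of the scenario above; NOT kernel code.
 * All toy_* names are hypothetical and exist only for illustration. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_task {
	bool unkillable;	/* stands in for /sbin/init */
	bool sigkill_pending;	/* SIGKILL queued, not yet dequeued */
	bool exited;		/* set once the task has died */
};

/* Models lock_page_killable(): the wait is aborted if SIGKILL is pending. */
static int toy_lock_page_killable(const struct toy_task *t)
{
	return t->sigkill_pending ? -EINTR : 0;
}

/* Models the current read path: any lock failure is reported as -EIO. */
static int toy_file_read(const struct toy_task *t)
{
	if (toy_lock_page_killable(t))
		return -EIO;
	return 0;
}

/* Models the return to user mode: SIGKILL is dequeued, but only a
 * killable task actually dies from it. */
static void toy_deliver_signals(struct toy_task *t)
{
	if (!t->sigkill_pending)
		return;
	t->sigkill_pending = false;
	if (!t->unkillable)
		t->exited = true;
}

/* Returns true if userspace gets to observe the error from the read. */
bool toy_userspace_sees_error(struct toy_task *t)
{
	int ret;

	assert(t != NULL);
	ret = toy_file_read(t);
	toy_deliver_signals(t);
	return ret < 0 && !t->exited;
}
```

In this model an ordinary task with SIGKILL pending exits before it can
look at the return value, while a task marked unkillable survives and is
left holding the error.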

Imho, this is 2.6.26 material. Unless I missed something, of course.

It is not clear to me what we should do. I'd like very much to avoid adding
more SIGNAL_UNKILLABLE checks, but perhaps we don't have another choice.
We can fix the bug with

--- kernel/signal.c
+++ kernel/signal.c
@@ -974,7 +974,7 @@ void zap_other_threads(struct task_struc

 int fastcall __fatal_signal_pending(struct task_struct *tsk)
 {
-	return sigismember(&tsk->pending.signal, SIGKILL);
+	return signal_group_exit(tsk->signal);
 }

, but this makes __fatal_signal_pending() slower, and because we use
tsk->signal, schedule() (in particular) can't use this helper.
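
The difference between the two checks can be sketched with another toy
userspace model. This is not kernel code and all toy_* names are
hypothetical. It rests on one assumption that should be verified against
the real signal-sending path: that the group-exit state is only set when
the signal is really going to kill the thread group, which is never the
case for an unkillable init.

```c
/* Toy userspace model; NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_signal {
	bool unkillable;	/* stands in for SIGNAL_UNKILLABLE (init) */
	bool group_exit;	/* stands in for what signal_group_exit() reports */
};

struct toy_thread {
	bool sigkill_pending;		/* stands in for SIGKILL in tsk->pending.signal */
	struct toy_signal *signal;	/* stands in for tsk->signal */
};

/* ASSUMPTION: SIGKILL is always queued, but the group-exit state is only
 * set when the signal will actually be fatal for the group. */
void toy_send_sigkill(struct toy_thread *t)
{
	t->sigkill_pending = true;
	if (!t->signal->unkillable)
		t->signal->group_exit = true;
}

/* Old check: looks only at the thread's own pending set. */
bool toy_fatal_pending_old(const struct toy_thread *t)
{
	return t->sigkill_pending;
}

/* New check: needs a valid ->signal pointer and one more dereference. */
bool toy_fatal_pending_new(const struct toy_thread *t)
{
	assert(t->signal != NULL);
	return t->signal->group_exit;
}
```

Under that assumption the old check reports a fatal signal for init even
though init will survive it, while the new check does not. The extra
requirement that ->signal be valid is visible in the assert.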

Anyway. How about the (untested/uncompiled) patch below for now? Returning
-EINTR or -ERESTARTNOINTR instead of -EIO looks "more correct" regardless.

Oleg.

--- mm/filemap.c
+++ mm/filemap.c
@@ -188,7 +188,7 @@ static int sync_page(void *word)
 static int sync_page_killable(void *word)
 {
 	sync_page(word);
-	return fatal_signal_pending(current) ? -EINTR : 0;
+	return fatal_signal_pending(current) ? -ERESTARTNOINTR : 0;
 }

 /**
@@ -1000,8 +1000,9 @@ page_ok:

 page_not_up_to_date:
 		/* Get exclusive access to the page ... */
-		if (lock_page_killable(page))
-			goto readpage_eio;
+		error = lock_page_killable(page);
+		if (error)
+			goto readpage_error;

 		/* Did it get truncated before we got the lock? */
 		if (!page->mapping) {
@@ -1029,8 +1030,9 @@ readpage:
 		}

 		if (!PageUptodate(page)) {
-			if (lock_page_killable(page))
-				goto readpage_eio;
+			error = lock_page_killable(page);
+			if (error)
+				goto readpage_error;
 			if (!PageUptodate(page)) {
 				if (page->mapping == NULL) {
 					/*
