Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled

From: Oleg Nesterov
Date: Thu Jun 28 2007 - 10:12:58 EST


(trimmed the cc: list)

On 06/28, Satyam Sharma wrote:
>
> On 6/27/07, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> >Contrary, I believe we should avoid signals when it
> >comes to kernel threads.
>
> And I agree, but there's quite a subtle difference between signals being
> used like they normally are, and this case here. Fact is, there is simply
> no other way to break out of certain functions (if there was, I would've
> preferred that myself).
>
> In fact, what I'm proposing is that kthreads should *not* be tinkering
> with (flushing, handling, dequeueing, whatever) signals at all, because
> like you mentioned, if they do that, it's possible that the TIF_SIGPENDING
> could get lost.

But we do have kthreads which call dequeue_signal(). And perhaps some
kthread treats SIGKILL as "urgent exit with a lot of printks" while
kthread_should_stop() means "exit gracefully".

> >I am talking about the case
> >when wait_event_interruptible() should not fail (unless something bad
> >happens) inside the "while (!kthread_should_stop())" loop.
> >Note also that kthread could use TASK_INTERRUPTIBLE sleep
> >[...] and because it knows that all signals are ignored.
>
> Ok, I think you're saying that if a kthread used wait_event_interruptible
> (and was not expecting signals, because it ignores them), then bad
> things (say exit in inconsistent state, etc) will happen if we break that
> wait_event_interruptible unexpectedly.

No. Of course, kthread should check the error and doesn't exit in
inconsistent state.

The question is: why should we break (say) tcp_recvmsg() inside
"while (!kthread_should_stop())" loop if it is supposed to succeed
unless something bad happens? (I mean, we may have a kthread which
doesn't expect the failure unless something bad happens).

OK, let me give a silly example. The correctly written kthread should check
the result of kmalloc(), yes? kthread(k) means that k should exit anyway, yes?
So let's change kthread_stop(k) to also set TIF_MEMDIE, this means that
__alloc_pages() will fail quickly when get_page_from_freelist() doesn't
succeed, but won't start pageout() which may take a while. Please don't
explain me why this suggestion is bad :), I am just trying to illustrate
my point.

I believe kthread_stop() should not send the signal. Just because it could
be actually dequeued by kthread, and it may have some special meaning for
this kthread.

Perhaps it makes sense to do signal_wake_up() (sets TIF_SIGPENDING and wakes
up), recalc_sigpending() could be changed to take kthread_should_stop() into
account (so TIF_SIGPENDING can't be cleared). Once again, I _personally_ do
not like this too much, but yes, I see your point, and I can't say such a
change is "wrong".

Hmm... actually, such a change breaks the

while (signal_pending(current))
dequeue_signal_and_so_something();

loop, see jffs2_garbage_collect_thread() for example.

In short, I think that kthread_stop() + TIF_SIGPENDING should be a special
case, the signal should be sent explicitly before kthread_stop(), like
cifs does. After all, the caller of kthread_stop(k) should know what and
why k does.

In any case, that kind of the changes can break things, just because
this means API change.

> And thirdly, what I'm proposing is putting the check for checking the
> SIGKILL in kthread_should_stop itself, in /addition/ to the
> kthread_stop_info.k == current check.

This is what I can't understand completely. Why should we check SIGKILL
or signal_pending() in addition to kthread_stop_info.k, what is the point?

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/