Re: waitpid advice ? (killing off a kernel thread)

B. James Phillippe (bryan@terran.org)
Wed, 23 Jun 1999 15:58:09 -0700 (PDT)


On Sat, 19 Jun 1999, David Waite wrote:

> On Sat, 19 Jun 1999, B. James Phillippe wrote:
> > On Sat, 19 Jun 1999, David Waite wrote:
> >
> > > The problem is that the drivers have threads running that are not dying
> > > before I start deallocating resources.
> > ...
> > > Anyways, my question is.. obviously my method of killing and then waiting
> > > for the process to die is not working right. What can I do instead? Wait

Hello,

Let me summarize the problem: we would like a reliable way to write a
module that starts a thread with kernel_thread() in init_module() and kills
that thread in cleanup_module(). The problem arises when cleanup_module()
races the exiting thread. If the thread wins, everything is okay. If
cleanup_module() wins, the resources allocated by the module are freed
before the thread is gone, which results in a kernel Oops.

Here is a patch that seems to solve the problem by allowing cleanup_module()
to sleep until the process is released. The existing wait_chldexit wait
queue in task_struct is not sufficient because it wakes only waiting
parents, and even then it wakes them as soon as the child becomes a zombie
(at which point the child is still around).

To use the patch, call one of the *sleep_on() functions (sleep_on(),
interruptible_sleep_on(), ...) on &task->wait_exit in cleanup_module().

There is a potential problem with my patch: there may be a race between
the wake_up() and the release(). What do you think?
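
For illustration only, here is a userspace model of the ordering the patch
is after: cleanup asks the worker to stop, sleeps until the worker has
announced its exit, and only then frees what the worker was using. This is
an analogy built on POSIX threads, not kernel code and not part of the
patch. Every name in it (fake_module, fake_init, fake_cleanup, exit_cv) is
hypothetical. The condition variable stands in for wait_exit, and the
"exited" flag is an addition that the patch does not have: the sleeper
re-checks it in a loop, so a wake-up that fires before the sleeper goes to
sleep is not lost.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

/* Hypothetical stand-in for a module's state; not taken from the patch. */
struct fake_module {
    pthread_mutex_t lock;
    pthread_cond_t  exit_cv;    /* plays the role of wait_exit */
    int stop_requested;         /* set by cleanup, read by the worker */
    int exited;                 /* set by the worker as its last act */
    int *resource;              /* memory the worker touches while running */
    pthread_t worker;
};

/* Worker: keeps touching the resource until asked to stop. */
static void *worker_fn(void *arg)
{
    struct fake_module *m = arg;

    pthread_mutex_lock(&m->lock);
    while (!m->stop_requested) {
        (*m->resource)++;
        pthread_mutex_unlock(&m->lock);
        sched_yield();
        pthread_mutex_lock(&m->lock);
    }
    m->exited = 1;                      /* record the exit first... */
    pthread_cond_signal(&m->exit_cv);   /* ...then wake the sleeper */
    pthread_mutex_unlock(&m->lock);
    return NULL;
}

/* Analogue of init_module(): allocate the resource, start the worker. */
static int fake_init(struct fake_module *m)
{
    m->stop_requested = 0;
    m->exited = 0;
    m->resource = calloc(1, sizeof(*m->resource));
    if (m->resource == NULL)
        return -1;
    pthread_mutex_init(&m->lock, NULL);
    pthread_cond_init(&m->exit_cv, NULL);
    if (pthread_create(&m->worker, NULL, worker_fn, m) != 0) {
        pthread_cond_destroy(&m->exit_cv);
        pthread_mutex_destroy(&m->lock);
        free(m->resource);
        m->resource = NULL;
        return -1;
    }
    return 0;
}

/* Analogue of cleanup_module(): returns how often the worker ran. */
static int fake_cleanup(struct fake_module *m)
{
    int touched;

    pthread_mutex_lock(&m->lock);
    m->stop_requested = 1;
    /* Sleep until the worker says it is done. Checking the flag in a
     * loop covers both a wake-up that came early and a spurious one. */
    while (!m->exited)
        pthread_cond_wait(&m->exit_cv, &m->lock);
    touched = *m->resource;
    pthread_mutex_unlock(&m->lock);

    /* Reap the thread (loosely, the release() step), then free. */
    pthread_join(m->worker, NULL);
    free(m->resource);
    m->resource = NULL;
    pthread_cond_destroy(&m->exit_cv);
    pthread_mutex_destroy(&m->lock);
    return touched;
}
```

A caller would run fake_init() on a struct fake_module and later
fake_cleanup() on the same struct. Whichever side wins the race, the
resource is freed only after the worker has stopped touching it. Build
with -pthread.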

Index: include/linux/sched.h
===================================================================
RCS file: /v/CVS-kernel/linux-2.2/include/linux/sched.h,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 sched.h
--- include/linux/sched.h 1999/06/15 22:48:58 1.1.1.1
+++ include/linux/sched.h 1999/06/23 04:49:56
@@ -268,6 +268,7 @@
struct task_struct **tarray_ptr;

struct wait_queue *wait_chldexit; /* for wait4() */
+ struct wait_queue *wait_exit;
struct semaphore *vfork_sem; /* for vfork() */
unsigned long policy, rt_priority;
unsigned long it_real_value, it_prof_value, it_virt_value;
@@ -356,7 +357,7 @@
/* proc links*/ &init_task,&init_task,NULL,NULL,NULL, \
/* pidhash */ NULL, NULL, \
/* tarray */ &task[0], \
-/* chld wait */ NULL, NULL, \
+/* chld wait */ NULL, NULL, NULL, \
/* timeout */ SCHED_OTHER,0,0,0,0,0,0,0, \
/* timer */ { NULL, NULL, 0, 0, it_real_fn }, \
/* utime */ {0,0,0,0},0, \
Index: kernel/exit.c
===================================================================
RCS file: /v/CVS-kernel/linux-2.2/kernel/exit.c,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 exit.c
--- kernel/exit.c 1999/06/15 22:48:56 1.1.1.1
+++ kernel/exit.c 1999/06/23 22:01:06
@@ -463,8 +463,11 @@
SET_LINKS(p);
write_unlock_irq(&tasklist_lock);
notify_parent(p, SIGCHLD);
- } else
+ } else {
+ if (waitqueue_active(&p->wait_exit))
+ wake_up(&p->wait_exit);
release(p);
+ }
#ifdef DEBUG_PROC_TREE
audit_ptree();
#endif
Index: kernel/fork.c
===================================================================
RCS file: /v/CVS-kernel/linux-2.2/kernel/fork.c,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 fork.c
--- kernel/fork.c 1999/06/15 22:48:56 1.1.1.1
+++ kernel/fork.c 1999/06/23 21:12:11
@@ -590,6 +590,7 @@
p->p_pptr = p->p_opptr = current;
p->p_cptr = NULL;
init_waitqueue(&p->wait_chldexit);
+ init_waitqueue(&p->wait_exit);
p->vfork_sem = NULL;

p->sigpending = 0;

-bp

--
# bryan at terran dot org
# http://www.terran.org/~bryan
