Re: [patch] Yenta module race condition

From: Andreas Dilger (adilger@turbolabs.com)
Date: Fri May 05 2000 - 14:24:35 EST


I wrote some code for a module that started a kernel thread. You can't
use module reference counts for the threads because rmmod does not let
you remove a module with a non-zero reference count. Hence, it is not
possible to have the module unload code kill the thread and the thread
do the DEC_MOD_USE_COUNT, as you never even get into the cleanup code
because of the non-zero reference count.

Similarly, it is not possible to have post-remove actions for the module,
as the threads are still running, and if you remove the module before
they are killed, they will oops on the non-existent code they are trying
to execute.

My thread slept, and was periodically awoken. It did the following:

                /* check if we are supposed to shut down */
                if (signal_pending(tsk))
                {
                        int stopped = 0;
                        spin_lock_irq(&tsk->sigmask_lock);
                        if (sigismember(&tsk->signal, SIGTERM))
                        {
                                sigdelset(&tsk->signal, SIGTERM);
                                stopped = 1;
                        }
                        recalc_sigpending(tsk);
                        spin_unlock_irq(&tsk->sigmask_lock);
                        if (stopped) {
                                printk("stopped...\n");
                                set_task_state(tsk, TASK_STOPPED);
                                tsk = NULL;
                                return 0;
                        }
                }

Inside the module cleanup code I did:

        /* deliver a signal to shut the thread down */
        if (tsk && (tsk->state == TASK_RUNNING ||
                    tsk->state == TASK_INTERRUPTIBLE )) {
                unsigned long timeout = HZ/20;
                unsigned long count = 0;
                send_sig_info(SIGTERM, (struct siginfo *)1, tsk);
                while (tsk) {
                        if ((count % 2*HZ) == timeout)
                                printk(KERN_INFO "wait for thread to stop\n");
                        count += timeout;
                        set_current_state(TASK_INTERRUPTIBLE);
                        schedule_timeout(timeout);
                }
        }

This way, the thread gets a chance to exit properly before the module
is unloaded. So far I have not noticed any problems with this code.
I suppose it would also be possible to force the thread to be scheduled
immediately after the signal is delivered, but that was beyond me, and
the thread would wake up by it self often enough that it wasn't a problem.

The reason I check the tsk->state is because during development when
the thread would oops, tsk was not set to NULL, and it was now pointing
to garbage. Trying to send a signal based in its contents would just
create another oops. In this case, since the thread is already dead,
it is also OK to unload the module. The thread was only ever in RUNNING
or INTERRUPTIBLE state.

Cheers, Andreas

-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun May 07 2000 - 21:00:18 EST