SMP locking bug

Steve Tate (srt@silo.csci.unt.edu)
Fri, 11 Sep 1998 17:09:50 -0500 (CDT)


I've posted a few requests here about my problems with the 2.1.120
(now 2.1.121) kernel (SMP) crashing. I've found the problem, and it
turns out to be a problem in the core part of the kernel, and not in a
hardware driver at all.

I'll describe the problem first, and then give a proposed correction
with a patch. I will say this: I've hacked many Unix kernels, but
have relatively little experience with the Linux kernel. Since I
don't really know the design/locking philosophy used in the Linux
kernel, I'd be happy if someone else wanted to make a better fix for
this problem.

The problem is with conflicting accesses to the t->sig pointer. The
problem always appears in the send_sig_info() function of
kernel/signal.c --- here's an excerpt of the code:

/* If t->sig is gone, we must be trying to kill the task. So
pretend that it doesn't exist anymore. */
ret = -ESRCH;
if (t->sig == NULL)
goto out_nolock;

/* The somewhat baroque permissions check... */
ret = -EPERM;
if ((!info || ((unsigned long)info != 1 && SI_FROMUSER(info)))
&& ((sig != SIGCONT) || (current->session != t->session))
&& (current->euid ^ t->suid) && (current->euid ^ t->uid)
&& (current->uid ^ t->suid) && (current->uid ^ t->uid)
&& !capable(CAP_SYS_ADMIN))
goto out_nolock;

/* The null signal is a permissions and process existance probe.
No signal is actually delivered. */
ret = 0;
if (!sig)
goto out_nolock;

ka = &t->sig->action[sig-1];

Between the first test of t->sig and the setting of ka, t->sig is
being set to NULL (by another processor in the function
__exit_sighand()). Because of this, ka is invalid, and use of that as
a pointer later in the function causes a kernel page fault.

Clearly the t->sig pointer needs to be locked when it is changed or
when the value must remain consistent throughout a block of code. I
would propose extending the existing t->sigmask_lock to lock this
pointer. I'm not totally sure what other t->sig accesses should be
protected. I've locked a few of those accesses (in the patch below),
but not others. Some of the other modifications to t->sig that I
have not locked are in kernel/fork.c and fs/exec.c, but those look
like brand new processes being initialized (so they wouldn't be
getting any signals yet). So I would think that those particular
accesses don't need to be locked, but someone more familiar with the
Linux kernel should give that some thought.

Anyway, the patch is attached below. This keeps my system stable
under the same conditions that made it reliably crash before. Now I
think I'll go have a beer and feel very pleased with myself for
tracking down that problem... :-)

--

Steve Tate srt@cs.unt.edu

================ Cut Here: Proposed Patch to 2.1.121 =================== diff -u -r linux-2.1.121/include/linux/sched.h linux/include/linux/sched.h --- linux-2.1.121/include/linux/sched.h Fri Sep 11 14:43:31 1998 +++ linux/include/linux/sched.h Fri Sep 11 15:23:46 1998 @@ -298,7 +298,7 @@ /* memory management info */ struct mm_struct *mm; /* signal handlers */ - spinlock_t sigmask_lock; /* Protects signal and blocked */ + spinlock_t sigmask_lock; /* Protects sig, signal, and blocked */ struct signal_struct *sig; sigset_t signal, blocked; struct signal_queue *sigqueue, **sigqueue_tail; diff -u -r linux-2.1.121/kernel/exit.c linux/kernel/exit.c --- linux-2.1.121/kernel/exit.c Sat Aug 15 18:34:52 1998 +++ linux/kernel/exit.c Fri Sep 11 15:05:16 1998 @@ -237,9 +237,12 @@ static inline void __exit_sighand(struct task_struct *tsk) { struct signal_struct * sig = tsk->sig; + unsigned long int flags; if (sig) { + spin_lock_irqsave(&tsk->sigmask_lock, flags); tsk->sig = NULL; + spin_unlock_irqrestore(&tsk->sigmask_lock, flags); if (atomic_dec_and_test(&sig->count)) kfree(sig); } diff -u -r linux-2.1.121/kernel/signal.c linux/kernel/signal.c --- linux-2.1.121/kernel/signal.c Thu Aug 20 16:30:25 1998 +++ linux/kernel/signal.c Fri Sep 11 14:58:36 1998 @@ -228,11 +228,13 @@ if (sig < 0 || sig > _NSIG) goto out_nolock; + spin_lock_irqsave(&t->sigmask_lock, flags); + /* If t->sig is gone, we must be trying to kill the task. So pretend that it doesn't exist anymore. */ ret = -ESRCH; if (t->sig == NULL) - goto out_nolock; + goto out; /* The somewhat baroque permissions check... */ ret = -EPERM; @@ -241,16 +243,15 @@ && (current->euid ^ t->suid) && (current->euid ^ t->uid) && (current->uid ^ t->suid) && (current->uid ^ t->uid) && !capable(CAP_SYS_ADMIN)) - goto out_nolock; + goto out; /* The null signal is a permissions and process existance probe. No signal is actually delivered. */ ret = 0; if (!sig) - goto out_nolock; + goto out; ka = &t->sig->action[sig-1]; - spin_lock_irqsave(&t->sigmask_lock, flags); switch (sig) { case SIGKILL: case SIGCONT: @@ -372,11 +373,16 @@ int force_sig_info(int sig, struct siginfo *info, struct task_struct *t) { + unsigned long int flags; + + spin_lock_irqsave(&t->sigmask_lock, flags); if (t->sig == NULL) return -ESRCH; if (t->sig->action[sig-1].sa.sa_handler == SIG_IGN) t->sig->action[sig-1].sa.sa_handler = SIG_DFL; + + spin_unlock_irqrestore(&t->sigmask_lock, flags); sigdelset(&t->blocked, sig); return send_sig_info(sig, info, t);

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/faq.html