Re: [PATCH] account_group_exec_runtime: fix the racy usage of ->signal

From: Oleg Nesterov
Date: Fri Nov 07 2008 - 12:41:48 EST


On 11/07, Doug Chapman wrote:
>
> On Fri, 2008-11-07 at 17:21 +0100, Ingo Molnar wrote:
> > * Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> > > @@ -351,10 +351,12 @@ static inline void account_group_exec_ru
> > > unsigned long long ns)
> > > {
> > >  	struct signal_struct *sig;
> > > +	unsigned long flags;
> > >
> > > -	sig = tsk->signal;
> > > -	if (unlikely(!sig))
> > > +	if (unlikely(!lock_task_sighand(tsk, &flags)))
> > >  		return;
> >
> > i think this will lock up: the signal lock must not nest inside the rq
> > lock, and these accounting functions are called from within the
> > scheduler.
>
> I can confirm that this does hang on bootup.

Thanks a lot Doug.

If only I could understand what happens. I am running the 2.6.27 kernel
with the patch below just fine.

Ingo, could you please explain?

OK, perhaps we can check ->exit_state... I'll return on Monday.


--- linux-2.6.27/kernel/sched_fair.c~DBG 2008-10-10 00:13:53.000000000 +0200
+++ linux-2.6.27/kernel/sched_fair.c 2008-11-07 19:15:28.000000000 +0100
@@ -484,6 +484,16 @@ __update_curr(struct cfs_rq *cfs_rq, str
 	curr->vruntime += delta_exec_weighted;
 }
 
+static void ttt(struct task_struct *tsk)
+{
+	unsigned long flags;
+
+	if (unlikely(!lock_task_sighand(tsk, &flags)))
+		return;
+
+	unlock_task_sighand(tsk, &flags);
+}
+
 static void update_curr(struct cfs_rq *cfs_rq)
 {
 	struct sched_entity *curr = cfs_rq->curr;
@@ -507,6 +517,7 @@ static void update_curr(struct cfs_rq *c
 		struct task_struct *curtask = task_of(curr);
 
 		cpuacct_charge(curtask, delta_exec);
+		ttt(curtask);
 	}
 }
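
For reference, the nesting concern quoted above (the signal lock must not
nest inside the rq lock) can be sketched in user space. This is only an
illustration, not kernel code: the lock class names, the order_seen table
and record_nesting() are hypothetical, and the sketch only checks direct
pairs of locks, not longer chains. It assumes some other path already takes
the rq lock while holding the signal lock (for example a wakeup issued under
the signal lock), which is what would make the opposite order unsafe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, named after the two locks discussed above. */
enum lock_class { LOCK_RQ, LOCK_SIGHAND, NR_LOCK_CLASSES };

/* order_seen[a][b] is true once some path took b while holding a. */
static bool order_seen[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

/*
 * Record that 'inner' is acquired while 'outer' is held.
 *
 * Returns false when the nesting is unsafe: either the same
 * non-recursive lock is taken twice, or the opposite order was
 * recorded earlier, so two CPUs could each hold one lock and spin
 * forever waiting for the other.
 */
static bool record_nesting(enum lock_class outer, enum lock_class inner)
{
	assert(outer < NR_LOCK_CLASSES && inner < NR_LOCK_CLASSES);

	if (outer == inner)
		return false;
	if (order_seen[inner][outer])
		return false;

	order_seen[outer][inner] = true;
	return true;
}
```

With this helper, record_nesting(LOCK_SIGHAND, LOCK_RQ) followed by
record_nesting(LOCK_RQ, LOCK_SIGHAND) makes the second call return false.
That is the pattern the debugging patch would create if update_curr() runs
with the rq lock held. A test on a single machine can still pass, because
the hang needs two CPUs to take the locks in opposite order at the same time.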

