Re: [PATCH]: Re: kernel 2.6.9-rc1-mm4 oops

From: Ingo Molnar
Date: Mon Sep 13 2004 - 04:26:42 EST



* Kirill Korotaev <dev@xxxxx> wrote:

> >the BUG() is useful for all the code that uses next_thread() - you can
> >only do a safe next_thread() iteration if you've locked ->sighand.

> 1. I don't see spin_lock() on p->sighand->siglock in do_task_stat()
> before calling next_thread(). And the check inside next_thread() permits
> only one of the locks to be taken:
>
> if (!spin_is_locked(&p->sighand->siglock) &&
> !rwlock_is_locked(&tasklist_lock))
>
> which is probably wrong, since tasklist_lock is always required!

It's not 'wrong' in terms of correctness it's simply too restrictive for
no reason. I agree that we should check for the tasklist lock only.

> 2. I think the idea of checking sighand is quite obscure. Probably it
> would be better to call pid_alive() for check at such places in proc,
> isn't it?

yeah, it's just as good of a check.

> 3. And yes, now I agree that this check and BUG() prevented
> next_thread() from walking through the deleted list_head in
> tsk->pid_list.

good.

> But I would propose to reorganize these checks in next_thread() to
> something like this:
>
> if (!rwlock_is_locked(&tasklist_lock) || p->pids[PIDTYPE_TGID].nr == 0)
> BUG();
>
> the last check ensures that we are still hashed and this check is more
> straithforward for understanding, agree?

yep - please send a new patch to Andrew.

> 4. If we do checks this way, then we can found strange proc numeric
> results. Suppose, you have read the do_task_stat() and it iterated
> through the threads and summed the times in this loop with
> next_thread(). Next, the thread dies and you can read the results w/o
> this loop and threads times, only this thread stats. Looks a bit
> invalid. Don't you think so? Maybe we should return an error?

i'd just skip filling out that statistics field - like my minimal patch
does.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/