Re: [PATCH] sigqueue_free: fix the race with collect_signal()

From: taoyue
Date: Sun Aug 26 2007 - 21:59:25 EST


Oleg Nesterov wrote:
On 08/24, Sukadev Bhattiprolu wrote:
Oleg Nesterov wrote:
On 08/24, taoyue wrote:
Oleg Nesterov wrote:
collect_signal: sigqueue_free:

list_del_init(&first->list);
spin_lock_irqsave(lock, flags);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

if (!list_empty(&q->list))
list_del_init(&q->list);
spin_unlock_irqrestore(lock, flags);
q->flags &= ~SIGQUEUE_PREALLOC;

__sigqueue_free(first); __sigqueue_free(q);
collect_signal() is always called under ->siglock which is also taken by
sigqueue_free(), so this is not possible.


I know, using current->sighand->siglock to prevent one sigqueue
is free twice. I want to know whether it is possible that the two
function is called in different thread. If that, the spin_lock is useless.
Not sure I understand. Yes, it is possible they are called by 2 different
threads, that is why we had a race. But all threads in the same thread
group have the same ->sighand, and thus the same ->sighand->siglock.
I has applied the new patch, and start test program last Friday.
So far, the test program has run for three days.
In my test program, all threads is created in one process, so they
are in the same thread group. If two thread is not in the same thread
group, they have the different ->sighand->siglock. In this case, the
lock can't prevent the sigqueue object from race condition.
Oleg, if one thread can be in collect_signal() and another in sigqueue_free() and both operate on the exact same sigqueue object, its not clear how we prevent two calls to __sigqueue_free() to
the same object. In that case the lock (or some lock) should be around __sigqueue_free() - no ?

i.e if we enter sigqueue_free(), we will call __sigqueue_free() regardless of the state.

Yes. They both will call __sigqueue_free(). But please note that __sigqueue_free()
checks SIGQUEUE_PREALLOC, which is cleared by sigqueue_free().

IOW, when sigqueue_free() unlocks ->siglock, we know that it can't be used
by collect_signal() from another thread. So we can clear SIGQUEUE_PREALLOC
and free sigqueue. We don't need this lock around sigqueue_free() to prevent
the race. collect_signal() can "see" only those sigqueues which are on list.

IOW, when sigqueue_free() takes ->siglock, colect_signal() can't run, because
it needs the same lock. Now we delete this sigqueue from list, nobody can
see it, it can't have other references. So we can unlock ->siglock, mark
sigqueue as freeable (clear SIGQUEUE_PREALLOC), and free it.

Do you agree?

Oleg.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/