Re: [PATCH] posix-timers: Do not modify an already queued timer signal

From: Oleg Nesterov
Date: Wed Jul 16 2008 - 12:18:21 EST


On 07/16, Mark McLoughlin wrote:
>
> When a timer fires, posix_timer_event() zeroes out its
> pre-allocated siginfo structure, initialises it and then
> queues up the signal with send_sigqueue().
>
> However, we may have previously queued up this signal, in
> which case we only want to increment si_overrun and
> re-initialising the siginfo structure is incorrect.

Quoting Roland McGrath:
>
> I'm not clear on how the already-queued case could ever happen. Do we
> really need that check at all? It shouldn't be possible for the timer to
> be firing when it's already queued, because it won't have been reloaded.
> It only reloads via do_schedule_next_timer after it's dequeued, or because
> a 1 return value said it never was queued.

> Also, since we are modifying an already queued signal
> without the protection of the sighand spinlock, we may also
> race with e.g. collect_signal() causing it to fail to find
> a signal on the pending list because it happens to look at
> the siginfo struct after it was zeroed and before it was
> re-initialised.
>
> The race was observed with a modified kvm-userspace when
> running a guest under heavy network load. When it occurs,
> KVM never sees another SIGALRM signal because although
> the signal is queued up the appropriate bit is never set
> in the pending mask.

Hmm. Yes, if collect_signal() races with posix_timer_event()->memset(),
we dequeue SIGALRM but leave the siginfo on list. I can't see how
this is possible though...

Could you verify that we don't have another bug? Say, add
WARN_ON(!list_empty()) to posix_timer_event().


If we need this fix, perhaps it is better to modify posix_timer_event()
to check !list_empty()?

Add Thomas.

Oleg.

> Manually sending the process a SIGALRM
> kicks it out of this state.
>
> The fix is simple - only modify the pre-allocated sigqueue
> once we're sure that it hasn't already been queued.
>
> Signed-off-by: Mark McLoughlin <markmc@xxxxxxxxxx>
> Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
> Cc: Roland McGrath <roland@xxxxxxxxxx>
>
> ---
> include/linux/sched.h | 2 +-
> kernel/posix-timers.c | 20 +++++++++++---------
> kernel/signal.c | 5 +++--
> 3 files changed, 15 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 2134917..718f7ec 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1791,7 +1791,7 @@ extern void zap_other_threads(struct task_struct *p);
> extern int kill_proc(pid_t, int, int);
> extern struct sigqueue *sigqueue_alloc(void);
> extern void sigqueue_free(struct sigqueue *);
> -extern int send_sigqueue(struct sigqueue *, struct task_struct *, int group);
> +extern int send_sigqueue(struct sigqueue *, siginfo_t *, struct task_struct *, int group);
> extern int do_sigaction(int, struct k_sigaction *, struct k_sigaction *);
> extern int do_sigaltstack(const stack_t __user *, stack_t __user *, unsigned long);
>
> diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
> index dbd8398..b42c964 100644
> --- a/kernel/posix-timers.c
> +++ b/kernel/posix-timers.c
> @@ -298,19 +298,21 @@ void do_schedule_next_timer(struct siginfo *info)
>
> int posix_timer_event(struct k_itimer *timr,int si_private)
> {
> - memset(&timr->sigq->info, 0, sizeof(siginfo_t));
> - timr->sigq->info.si_sys_private = si_private;
> + siginfo_t info;
> +
> + memset(&info, 0, sizeof(siginfo_t));
> + info.si_sys_private = si_private;
> /* Send signal to the process that owns this timer.*/
>
> - timr->sigq->info.si_signo = timr->it_sigev_signo;
> - timr->sigq->info.si_errno = 0;
> - timr->sigq->info.si_code = SI_TIMER;
> - timr->sigq->info.si_tid = timr->it_id;
> - timr->sigq->info.si_value = timr->it_sigev_value;
> + info.si_signo = timr->it_sigev_signo;
> + info.si_errno = 0;
> + info.si_code = SI_TIMER;
> + info.si_tid = timr->it_id;
> + info.si_value = timr->it_sigev_value;
>
> if (timr->it_sigev_notify & SIGEV_THREAD_ID) {
> struct task_struct *leader;
> - int ret = send_sigqueue(timr->sigq, timr->it_process, 0);
> + int ret = send_sigqueue(timr->sigq, &info, timr->it_process, 0);
>
> if (likely(ret >= 0))
> return ret;
> @@ -321,7 +323,7 @@ int posix_timer_event(struct k_itimer *timr,int si_private)
> timr->it_process = leader;
> }
>
> - return send_sigqueue(timr->sigq, timr->it_process, 1);
> + return send_sigqueue(timr->sigq, &info, timr->it_process, 1);
> }
> EXPORT_SYMBOL_GPL(posix_timer_event);
>
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 6c0958e..50e0b13 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -1292,9 +1292,9 @@ void sigqueue_free(struct sigqueue *q)
> __sigqueue_free(q);
> }
>
> -int send_sigqueue(struct sigqueue *q, struct task_struct *t, int group)
> +int send_sigqueue(struct sigqueue *q, siginfo_t *info, struct task_struct *t, int group)
> {
> - int sig = q->info.si_signo;
> + int sig = info->si_signo;
> struct sigpending *pending;
> unsigned long flags;
> int ret;
> @@ -1322,6 +1322,7 @@ int send_sigqueue(struct sigqueue *q, struct task_struct *t, int group)
>
> signalfd_notify(t, sig);
> pending = group ? &t->signal->shared_pending : &t->pending;
> + copy_siginfo(&q->info, info);
> list_add_tail(&q->list, &pending->list);
> sigaddset(&pending->signal, sig);
> complete_signal(sig, t, group);
> --
> 1.5.5.1
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/