Re: [PATCH] posix-timers: Do not modify an already queued timer signal

From: Roland McGrath
Date: Sun Jul 20 2008 - 02:52:57 EST

Next message: David Miller: "Re: [RFC] netdev: debugging option"
Previous message: Ingo Molnar: "Re: [PATCH] x86,pci: detect end_bus_number according to acpi/e820reserved"
In reply to: Oleg Nesterov: "Re: [PATCH] posix-timers: Do not modify an already queued timer signal"
Next in thread: Oleg Nesterov: "Re: [PATCH] posix-timers: Do not modify an already queued timer signal"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

I think the solution we want is to make sure that a timer whose
expiration signal is still pending never gets a second notification.

POSIX says, "The effect of disarming or resetting a timer with pending
expiration notifications is unspecified." This gives us lots of
latitude. I think the best behavior is the one that comports with the
general specification, "Only a single signal shall be queued to the
process for a given timer at any point in time."

The it_requeue_pending/si_sys_private logic is indeed intended to
address this case. It predates my involvement of the code, and I
agree it has always looked oddly hairy. Its plan is to make sure that
dequeue_signal's call to do_schedule_next_timer does not re-arm the
timer when an intervening timer_settime call has already done so. As
Mark found, this is buggy because it's not really safe for the timer
to go off (call posix_timer_event again) before the dequeue.

In the signals code, si_sys_private is used purely as a flag that
dequeue_signal needs to bother with dropping the siglock to call
do_schedule_next_timer. Its use as a serialization counter between
timer_settime and do_schedule_next_timer is entirely local to the
posix-timers code.

What userland should observe is that when the new timer expiration
time arrives while the previously-fired timer signal is still pending,
there is no new signal, and the timer accrues an overrun.

An obvious idea is simply not to re-arm in timer_settime when the
timer already has a pending notification. When the signal gets
dequeued, do_schedule_next_timer will arm it then. However, the way
we dispatch to different clocks' timer_set functions makes it awkward
to validate and store a new setting and fetch the old value without
actually arming the timer. Also, the logic for one-shot timers has
more wrinkles.

So I think the thing to do is arm the timer in timer_settime even when
it's already pending (as we do now), but have posix_timer_event detect
a firing when already queued and just bump the overrun count.

We can do away with it_requeue_pending, and make si_sys_private purely
a simple 0/1 flag for whether to call do_schedule_next_timer. To
optimize the extra checks, we use different bookkeeping fields in
struct k_itimer.

Have a flag that means "might be queued". do_schedule_next_timer
replaces:

if (timr && timr->it_requeue_pending == info->si_sys_private) {
...
}

with:

if (timr && timr->fired) {
timr->fired = 0;
...
}

That avoids races with timer_settime, which does (after timer_set
call, timer still locked):

if (timr->fired) {
spin_lock(&current->sighand->siglock);
if (list_empty(&timr->sigq->list))
timr->fired = 0;
spin_unlock(&current->sighand->siglock);
}

In posix_timer_event (timer is locked), check:

if (timr->fired) {
spin_lock(&current->sighand->siglock);
if (list_empty(&timr->sigq->list))
timr->fired = 0;
spin_unlock(&current->sighand->siglock);
if (timr->fired) {
timr->it_overrun++;
return 0;
}
}
timr->fired = 1;

The siglock'd double-check is in case of a one-shot firing that didn't
ever call do_schedule_next_timer to reset the flag.

This bookkeeping plan may need more thought. I think the idea is
sound: "firing", meaning notification and reload, only occurs when
both the expiration time has passed and the previous expiration
notification signal has been dequeued. When the expiration time
passes but the other criterion is unmet, there's an overrun and no new
notification or reload.

Thanks,
Roland
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: David Miller: "Re: [RFC] netdev: debugging option"
Previous message: Ingo Molnar: "Re: [PATCH] x86,pci: detect end_bus_number according to acpi/e820reserved"
In reply to: Oleg Nesterov: "Re: [PATCH] posix-timers: Do not modify an already queued timer signal"
Next in thread: Oleg Nesterov: "Re: [PATCH] posix-timers: Do not modify an already queued timer signal"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]