Re: [PATCH 3/7] dlm: recovery

From: David Teigland
Date: Tue Apr 26 2005 - 02:26:48 EST


On Mon, Apr 25, 2005 at 09:42:21AM -0700, Nish Aravamudan wrote:
> On 4/25/05, David Teigland <teigland@xxxxxxxxxx> wrote:

> > +int dlm_wait_function(struct dlm_ls *ls, int (*testfn) (struct dlm_ls *ls))
> > +{
> > + int error = 0;
> > +
> > + for (;;) {
> > + wait_event_interruptible_timeout(ls->ls_wait_general,
> > + testfn(ls) ||
> > + test_bit(LSFL_LS_STOP, &ls->ls_flags),
>
> Shouldn't this be
> dlm_recover_stopped(ls) and not test_bit()?

yes, done

> > + (dlm_config.recover_timer * HZ));
> > + if (testfn(ls))
> > + break;
> > + if (dlm_recovery_stopped(ls)) {
> > + error = -1;
> > + break;
> > + }
> > + }
> > +
> > + return error;
> > +}
>
> Rather than wrap an infinite loop in an infinite loop, since all you
> really want the second loop for is a periodic gurantee of
> recover_timer seconds, wouldn't it be easier to have a sof-timer set
> up to go off in that amount of time, and have it's callback do an
> unconditional wake-up on the local wait-queue then mod_timer the timer
> back in? Or might there be other tasks sleeping on the same
> wait-queue? At that point, since your code does not seem to care about
> signals (uses interruptible version, but doesn't check for
> signal_pending() return value from wait_event), you probably could
> just use the stock version of wait_event(). The conditions would be
> checked in wait_event() on the desired periodic basis, just as they
> are now, but maybe slightly more efficiently (and cleaner code, IMO :)
> ). If you do *need* interruptible, please put in a comment as to why.


Don't need inter, wait_event_timeout should work. Let's see if I followed
all that; are you suggesting:

static void dlm_wait_timer_fn(unsigned long data)
{
struct dlm_ls *ls = (struct dlm_ls *) data;
wake_up(&ls->ls_wait_general);
}

int dlm_wait_function(struct dlm_ls *ls, int (*testfn) (struct dlm_ls *ls))
{
struct timer_list timer;
int error = 0;

init_timer(&timer);
timer.function = dlm_wait_timer_fn;
timer.data = (long) ls;

for (;;) {
mod_timer(&timer, jiffies + (dlm_config.recover_timer * HZ));

wait_event(ls->ls_wait_general,
testfn(ls) || dlm_recovery_stopped(ls));

if (timer_pending(&timer))
del_timer(&timer);

if (testfn(ls))
break;
if (dlm_recovery_stopped(ls)) {
error = -1;
break;
}
}
return error;
}

[Another thread should usually be calling wake_up().]

This is actually how it was some months ago, but I thought the timers made
it more complicated given that wait_event_timeout is available. Do you
really think the timers are nicer if we don't use interruptible?


> > +int dlm_wait_status_low(struct dlm_ls *ls, unsigned int wait_status)
> > +{
> > + struct dlm_rcom *rc = (struct dlm_rcom *) ls->ls_recover_buf;
> > + int error = 0, nodeid = ls->ls_low_nodeid;
> > +
> > + for (;;) {
> > + error = dlm_recovery_stopped(ls);
> > + if (error)
> > + goto out;
> > +
> > + error = dlm_rcom_status(ls, nodeid);
> > + if (error)
> > + break;
> > +
> > + if (rc->rc_result & wait_status)
> > + break;
> > + else {
> > + set_current_state(TASK_INTERRUPTIBLE);
> > + schedule_timeout(HZ >> 1);
>
> 500 ms is a long time! What's the justification? No comments?
> Especially considering you will wake up on *every* signal. It's
> unlikely you'll actually sleep for 500 ms. Also, please use
> msleep_interruptible(), unless you expect to be woken by wait-queues
> (I didn't have enough time to trace all the possible code paths :) )
> -- in which case, make it explicit with comments, please.

This is polling the status of a remote node. All I'm after here is a
delay between the dlm_rcom_status() calls so that repeated status messages
aren't flooding the network. We keep sending status messages until the
remote node reports the status we want to see.

When the remote node has _failed_ we're repeating these status requests
hopelessly until the membership system detects the node failure and
recovery is aborted (dlm_recovery_stopped) -- up to several seconds. When
the remote node is just _slower_ at doing recovery, I'm guessing the
difference is on average a second.

Given that info, do you have a suggested delay? I've already switched to
msleep.

Dave

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/