Re: [PATCH] make-cancel_rearming_delayed_work-reliable-fix

From: Anton Vorontsov
Date: Mon May 07 2007 - 07:56:02 EST


On Mon, May 07, 2007 at 02:34:20PM +0400, Oleg Nesterov wrote:
> On 05/07, Jarek Poplawski wrote:
> >
> > There is a lot of new things in the final version of this
> > patch. I guess, there was no such problem in the previous
> > version.
>
> No, this is basically the same patch + re-check-cwq-after-lock,
> the latter is mostly needed to prevent racing with CPU-hotplug.
>
> > I can also see you have new doubts about usefulness, which
> > I cannot understand:
> > - even if there are some slowdowns, where does it matter?
> > - the "old" method uses only one method of cancelling, i.e.
> > del_timer, not trying to stop requeuing or to remove from
> > the queue; it seems to be effective only with long delayed
> > timers, and its real problems are probably mostly invisible.
>
> The slowdown is small, changelog mentions it just to be "fair".
>
> I am not happy with the complication this patch adds, mostly
> I hate this smb_wmb() in insert_work(). I have an idea how to
> remove it later, but this needs another patch not related to
> workqueue.c.
>
> > BTW, I'm still not convinced all additions are needed:
> > the "old" cancel_rearming_ doesn't care about checking
> > or waiting on anything after del_timer positive.
>
> It would be very strange to do wait_on_work() only in case
> when del_timer() failed. This way we still need to do
> cancel_work_sync() after cancel_rearming_delayed_work(),
> but only when del_timer() failed, ugly. Note also that
> wait_on_work() does not sleep if work->func() is not running.
>
> Also, consider this callback:
>
> void work_handler(struct work_struct *w)
> {
> struct delayed_work dw = container_of(...);
>
> queue_delayed_work(dw, delay);
>
> // <------------- cancel_rearming_delayed_work()
>
> cancel_delayed_work(dw);
> queue_delayed_work(dw, another_delay);
> }
>
> Yes, this is strange and ugly. But correct! The current version

I guess pseudo code below is not that strange, but real usecase:

probe()
{
INIT_DELAYED_WORK(...);
/* we're not issuing queue_delayed_work() in probe(), work will
* be started by interrupt */
return;
}

remove()
{
/* hang will happen here if there was no queue_delyed_work()
* call (like there was no interrupts, which starts rearming
* work */
cancel_rearming_delayed_work();
}


Your patch will fix it, right?

> Oleg.

Good luck,

--
Anton Vorontsov
email: cbou@xxxxxxx
backup email: ya-cbou@xxxxxxxxx
irc://irc.freenode.org/bd2
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/