Re: [PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

From: David Chinner
Date: Thu Apr 19 2007 - 10:47:18 EST


On Thu, Apr 19, 2007 at 08:54:04AM +0200, Jarek Poplawski wrote:
> Hi,
>
> IMHO cancel_rearming_delayed_work is dangerous place:

Agreed - I spent a couple of hours today learning why it
can only be used on work functions that always rearm...

> - it assumes a work function always rearms (with no exception),
> which probably isn't explained enough now (but anyway should
> be checked in such loops);
>
> - probably possible (theoretical) scenario: a few work
> functions rearm themselves with very short, equal times;
> before flush_workqueue ends, their timers are already
> fired, so cancel_delayed_work has nothing to do.

Easier than that - have a work function that rearms only if there's
more work to do in the future. You only arm the timer when you
have work to do, and it only rearms if there's more work to
do in the future (e.g. rotating expiry lists).

i.e. while there's more work to do, you need to call
cancel_rearming_delayed_work() to stop it reliably, but if you race
with the work function not restarting itself, you hang.....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/