Re: [PATCH] add unschedule_delayed_work to the workqueue API

From: James Bottomley
Date: Mon Oct 18 2004 - 18:25:51 EST


On Mon, 2004-10-18 at 17:57, Andrew Morton wrote:
> James Bottomley <James.Bottomley@xxxxxxxxxxxx> wrote:
> You mean the scsi code wants to schedule a work some time in the future and
> that when it executes, the handler will then run for several seconds?
>
> If so, it sounds like this is the wrong mechanism to use. We certainly
> don't want keventd gone to lunch for that long, and even if the kernel
> threads were privately created by the scsi code, you'll have to create many
> such threads to get any sort of parallelism across many disks. I suspect
> I'm missing something.

There is a case where this can happen, yes, but I admit it's not the
common one...most DVs (Domain Validations) are kicked off directly from
user context, without a thread (either in the initial scan or by user
request). The common case where we use a workqueue is when an
asynchronous event that alters the transport agreement (mainly a bus
reset) comes in via an interrupt.

> > However, now there's a worse problem. If I want to cancel a piece of
> > work synchronously, flush_scheduled_work() makes me dependent on the
> > execution of all the prior pieces of work. If some of them are related,
> > like Domain Validation and device unlocking say, I have to now be
> > extremely careful that the place I cancel and flush from isn't likely to
> > deadlock with any other work scheduled on the device. This makes it a
> > hard to use interface.
>
> Well yes, the caller wouldn't want to be holding any locks which cause
> currently-queued works to block.

It's not just locks (hopefully there shouldn't be any of these). It's
also device states.

For instance, DV will attempt to quiesce the device (i.e. wait for its
currently outstanding commands to complete). Thus, I can't use the
cancel/flush method in any code path (such as may be called from error
handling) which could itself be part of the completion path.

This dependence is caused by having to wait for everything that could
potentially be on the workqueue to complete. I'd really prefer an API
that didn't have this potentially nasty entanglement, which is why I
coded mine to only do the wait if the piece of work being removed was in
the process of executing, but not otherwise.
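
To make the difference concrete, here is a rough userspace model of the
two behaviours. It is only a sketch, not kernel code and not the patch
itself: struct queue, queue_push(), queue_start_next(),
flush_wait_count() and targeted_cancel() are names made up for the
illustration, and "waiting" is reduced to counting how many work items
the caller would end up depending on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a single-threaded workqueue (illustration only). */
enum work_state { WORK_IDLE, WORK_PENDING, WORK_RUNNING };

struct work {
	const char *name;
	enum work_state state;
};

#define QUEUE_MAX 8

struct queue {
	struct work *items[QUEUE_MAX];	/* items[0] is the oldest pending entry */
	size_t count;			/* number of pending entries */
	struct work *running;		/* item the worker is executing, or NULL */
};

/* Queue an idle work item; fails if the queue is full or w is already queued. */
static bool queue_push(struct queue *q, struct work *w)
{
	if (q->count == QUEUE_MAX || w->state != WORK_IDLE)
		return false;
	w->state = WORK_PENDING;
	q->items[q->count++] = w;
	return true;
}

/* Model the worker thread picking up the oldest pending item. */
static struct work *queue_start_next(struct queue *q)
{
	if (q->running || q->count == 0)
		return NULL;
	struct work *w = q->items[0];
	assert(w->state == WORK_PENDING);
	for (size_t i = 1; i < q->count; i++)
		q->items[i - 1] = q->items[i];
	q->count--;
	w->state = WORK_RUNNING;
	q->running = w;
	return w;
}

/* Cancel-then-flush: the caller depends on every queued and running item. */
static size_t flush_wait_count(const struct queue *q)
{
	return q->count + (q->running ? 1 : 0);
}

/*
 * Targeted cancel: a pending item is simply unlinked (nothing to wait
 * for); only an item that is already executing has to be waited on.
 * Returns the number of work items the caller must wait for (0 or 1).
 */
static size_t targeted_cancel(struct queue *q, struct work *w)
{
	if (w->state == WORK_RUNNING)
		return 1;
	if (w->state == WORK_PENDING) {
		for (size_t i = 0; i < q->count; i++) {
			if (q->items[i] != w)
				continue;
			for (size_t j = i + 1; j < q->count; j++)
				q->items[j - 1] = q->items[j];
			q->count--;
			break;
		}
		w->state = WORK_IDLE;
	}
	return 0;
}
```

With, say, an unlock item running and a DV item queued behind it, the
flush-style count covers both, so the caller inherits whatever the
unlock is blocked on. The targeted cancel of the DV item returns 0 and
never looks at the unlock at all.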

> > By contrast, the proposed patch will *only* wait
> > if the item of work is currently executing. This is a (OK reasonably
> > given the aic del_timer_sync() issues) well understood deadlock
> > problem---the main point being I now don't have to consider any of the
> > other work that might be queued.
>
> OK.

James

