> > where you get maximum concurrency during the pre-synchronization
> > part, and a "chain" of synchronized execution *as part of the same
> > function flow*, but possibly independent of other synchronization
> > flows.
> This too can be implemented using wq directly. More below.

however, that forces the function to be split into pieces,
which makes for a more complex programming model.
For example, I have trouble proving to myself that your ata conversion
is actually correct.

> The tradeoff changes with the worker pool implementation can be shared
> with workqueue which provides its own ways to control concurrency and
> synchronize.

while I don't mind sharing the pool implementation (all 20 lines of
it ;-), I don't think the objective of sharing some implementation
detail is worth complicating the programming model.

> Before, the cookie based synchronization is something
> inherent to the async mechanism. The async worker pool was needed and
> the synchronization mechanism came integrated with it. Now that the
> backend can be replaced with workqueue which supplies its own ways of
> synchronization, the cookie based synchronization model needs stronger
> justification as it no longer comes as an integral part of something
> bigger which is needed anyway.

the wq model is either "fully async" or "fully ordered".
the cookie mechanism allows for "run async for the expensive bit, and
then INSIDE THE SAME FUNCTION, synchronize, and then run some more".
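
to make that shape concrete, here is a rough userspace model of the
pattern using pthreads. It is only a sketch and not the kernel's async
code: probe(), synchronize_cookie(), expensive_work() and run_probes()
are names made up for the illustration, and the "cookie" is simply an
array index handed out in submission order.

```c
#include <pthread.h>
#include <stdint.h>

#define NTASKS 4

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cv = PTHREAD_COND_INITIALIZER;
static int done[NTASKS];	/* done[c] is set once cookie c has finished */
static int order[NTASKS];	/* cookies in the order the ordered parts ran */
static int order_len;

/* Hypothetical stand-in for the slow part; later cookies get less work. */
static unsigned long expensive_work(int cookie)
{
	volatile unsigned long sum = 0;
	unsigned long n = (unsigned long)(NTASKS - cookie) * 2000000UL;

	for (unsigned long i = 0; i < n; i++)
		sum += i;
	return sum;
}

/* Block until every task with a lower cookie has finished. */
static void synchronize_cookie(int cookie)
{
	pthread_mutex_lock(&lock);
	for (int c = 0; c < cookie; c++)
		while (!done[c])
			pthread_cond_wait(&done_cv, &lock);
	pthread_mutex_unlock(&lock);
}

/* One function, top to bottom: async part, sync point, ordered part. */
static void *probe(void *arg)
{
	int cookie = (int)(intptr_t)arg;

	(void)expensive_work(cookie);	/* runs concurrently with the others */

	synchronize_cookie(cookie);	/* wait for everything submitted earlier */

	pthread_mutex_lock(&lock);	/* ordered part, e.g. registration */
	order[order_len++] = cookie;
	done[cookie] = 1;
	pthread_cond_broadcast(&done_cv);
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Returns 0 if the ordered parts ran in cookie order, -1 otherwise. */
int run_probes(void)
{
	pthread_t t[NTASKS];

	order_len = 0;
	for (int c = 0; c < NTASKS; c++)
		done[c] = 0;

	for (int c = 0; c < NTASKS; c++)
		if (pthread_create(&t[c], NULL, probe, (void *)(intptr_t)c))
			return -1;
	for (int c = 0; c < NTASKS; c++)
		pthread_join(t[c], NULL);

	for (int c = 0; c < NTASKS; c++)
		if (order[c] != c)
			return -1;
	return 0;
}
```

the point of the sketch is that probe() stays a single function: the
slow part overlaps across all tasks, and only the tail after the
synchronize call is serialized in submission order. Calling
run_probes() from a small main() and checking for 0 exercises it.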

> If so, we can leave the list based cookie synchronization alone and
> simply use wq's to provide concurrency only without using its
> synchronization mechanisms (flushes).

can you flush from inside a wq element?
That's the critical part that makes the cookie based system easy to

