Well, this one is quite a pain in the ass.
I'm not very fond of ->rq_select_barrier() approach for the following reasons.
* That removes possibility of correct synchronization. With blk_queue_ordered() approach, we can later add blk_queue_[un]lock_ordered() to achieve correct synchronization if that becomes necessary, but with ->rq_select_barrier() approach, the low-level driver ends up having less control over what's gonna happen when.
* Changing ordered mode is not supposed to be a frequent operation and the blk_queue_ordered() interface makes that explicit.
So, I added ide_driver_t->protocol_changed() callback which gets called whenever dma/multimode changes occur. Unfortunately, dma/multmode changes can be committed with or without context, and with or without queue lock. As blk_queue_ordered uses the queue lock for synchronization, this becomes issue.
I tried to distinguish places where the changes occur while queue lock is held from the other. Not only was it highly error-prone, it couldn't be done without modifying/auditing all low-level drivers as some drivers (cs5520) use the same function which touches dma setting (cs5520_tune_chipset) from both ->speedproc (called with queuelock) and ->ide_dma_check (called without queuelock).
One alternative I'm thinking of is using a workqueue to call blk_queue_ordered, such that we don't have to guess whether or not we're called with queuelock held. Unfortunately, this will give us a small window where wrong barrier requests can hit the drive.
Bartlomiej, any ideas?
Jens, as this one seems to need some time to settle, I'm gonna post updated patchset for post-2.6.15 without ide-fua patch, so that the other stuff can be pushed into -mm. I think we can live without ide-fua for a while. :-)
Thanks.