Submitting large numbers of buffer_heads from b_end_io is _nasty_; the
2.4 io scheduler runtime isn't exactly world champion, and you are doing
this at hard irq time. Not a good idea. Definitely not the true path to
performance, unless you don't care about anything else in the system.
At least in 2.6 you have a much faster io scheduler and, additionally,
large bios, so you won't spend nearly as much time there if you are
clever. You still need process context, though; that hasn't changed.

Sounds like I need to move to 2.6. I noticed the elevator is coalescing
quite well, and since I am posting mostly contiguous runs of sectors,
what ends up at the adapter level would probably not change much between
2.4 and 2.6, since I am maxing out the driver request queues as it is
(255 pending requests of 32 scatter/gather elements of 256 sector runs).
2.6 might help, but I suspect it will only alleviate the submission
overhead and not make much difference to performance, since the 3Ware
card does have an upper limit on outstanding I/O requests.
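
For a rough sense of scale, the queue figures quoted above (255 pending
requests, 32 scatter/gather elements each, 256 sectors per element) can
be turned into a byte count. This is only back-of-the-envelope
arithmetic: the 512-byte sector size is an assumption, as is the reading
that every scatter/gather element carries a full 256-sector run, and the
function names are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Figures taken from the message. */
#define PENDING_REQUESTS        255u
#define SG_ELEMENTS_PER_REQUEST 32u
#define SECTORS_PER_SG_ELEMENT  256u

/* Assumption: classic 512-byte sectors (not stated in the thread). */
#define SECTOR_BYTES            512u

/* Bytes carried by one fully populated request. */
static uint64_t bytes_per_request(void)
{
    return (uint64_t)SG_ELEMENTS_PER_REQUEST
         * SECTORS_PER_SG_ELEMENT
         * SECTOR_BYTES;
}

/* Bytes outstanding when the driver queue is completely full. */
static uint64_t bytes_in_flight(void)
{
    return (uint64_t)PENDING_REQUESTS * bytes_per_request();
}

/* Print both figures in MiB; call this from main() to try it out. */
static void print_queue_summary(void)
{
    const double mib = 1024.0 * 1024.0;

    printf("per request: %.0f MiB\n", (double)bytes_per_request() / mib);
    printf("in flight:   %.0f MiB\n", (double)bytes_in_flight() / mib);
}
```

Under those assumptions each request is 4 MiB and a full queue holds a
little under 1 GiB, which fits the point that the adapter's limit on
outstanding requests, rather than submission overhead, would be the
ceiling.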