> Is that also adding 150 usecs to each IO operation?

It is; it's the identical mechanism. SCSI used to do completions via
tasklets. It was converted to softirqs a long time ago, but to my
knowledge nobody ever did timings on it. From the few timings I
showed, 150 usec is a _best_ case time on my hardware. I saw 10 msec
as well, which is just bad beyond describing.

My suggestion (I'll code this up) is that we scrap the softirq
completion and just do it from the irq event. The typical completion
doesn't even need to grab any locks.

Jens Axboe

