RFC: Micro-optimize direct IO submission path

From: Andi Kleen
Date: Wed Jun 22 2011 - 15:20:37 EST


Inspired by the recent fast path DIO patch from Daniel
Ehrenberg.

I spent some time micro-optimizing the "slow" direct IO submission
path. This should get rid of large parts of the memset and dio
access costs that Dan noticed.

I moved everything that isn't needed in the completion handler
back onto the stack, so that it is more likely to be cache hot.

The patch also inlines everything to allow the compiler to optimize more.
In particular, it can now split the sdio structure into individual
variables and then get rid of unnecessary initializations.
This costs some text size, but I think it's worth it for such a hot
path.
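
A minimal userspace sketch of the split described above, not the patch
itself. The struct names, fields and functions (dio_sketch,
dio_submit_sketch, submit_one_block, do_direct_io_sketch) are
hypothetical and only illustrate the idea: state the completion handler
needs stays in a heap object, while submission-only state lives in a
stack structure that an inlined helper updates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical long-lived state: only what a completion handler would read. */
struct dio_sketch {
	size_t result;    /* bytes accounted so far */
	int refcount;     /* outstanding references */
};

/* Hypothetical submission-only state: lives on the submitter's stack. */
struct dio_submit_sketch {
	size_t cur_offset;          /* next offset to submit */
	size_t end_offset;          /* one past the last byte */
	size_t block_size;          /* bytes per submitted block */
	unsigned blocks_submitted;  /* bookkeeping used only during submission */
};

/*
 * Small static inline helper. Because the caller's sdio is a local whose
 * address does not escape after inlining, the compiler is free to keep its
 * fields in registers and drop stores that are never read.
 */
static inline void submit_one_block(struct dio_sketch *dio,
				    struct dio_submit_sketch *sdio)
{
	size_t left = sdio->end_offset - sdio->cur_offset;
	size_t n = left < sdio->block_size ? left : sdio->block_size;

	sdio->cur_offset += n;
	sdio->blocks_submitted++;
	dio->result += n;
}

/* Pretend to submit len bytes in block_size chunks; returns bytes accounted. */
size_t do_direct_io_sketch(size_t len, size_t block_size)
{
	/* Stack state: no memset of a large heap object is needed. */
	struct dio_submit_sketch sdio = {
		.end_offset = len,
		.block_size = block_size,
	};
	struct dio_sketch *dio;
	size_t done;

	assert(block_size > 0);

	dio = malloc(sizeof(*dio));
	if (!dio)
		return 0;
	/* Only the fields the completion side needs are initialized. */
	dio->result = 0;
	dio->refcount = 1;

	while (sdio.cur_offset < sdio.end_offset)
		submit_one_block(dio, &sdio);

	done = dio->result;
	free(dio);
	return done;
}
```

For example, do_direct_io_sketch(10000, 4096) walks three blocks and
returns 10000. Whether the compiler really splits the stack structure
into individual variables depends on the optimization level and the
inlining decisions.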

The dio is now allocated from its own slab cache, which avoids some fast
path overhead.
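
As a rough userspace analogy only (the kernel slab allocator is far more
elaborate, and this is not the code from the patch), a dedicated cache
for one fixed-size object type can hand back a recently freed object
without going through the general-purpose allocator. All names below
(dio_obj, dio_cache_alloc, dio_cache_free) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size object managed by the cache. */
struct dio_obj {
	size_t result;
	int refcount;
};

/* While an object is free, its storage holds the free-list link. */
union cache_slot {
	struct dio_obj obj;
	union cache_slot *next_free;
};

/* Single-threaded free list; a real cache would need locking or per-CPU lists. */
static union cache_slot *free_list;

/* Reuse a freed object if one is available, else fall back to malloc(). */
struct dio_obj *dio_cache_alloc(void)
{
	union cache_slot *slot = free_list;

	if (slot)
		free_list = slot->next_free;
	else
		slot = malloc(sizeof(*slot));

	return slot ? &slot->obj : NULL;
}

/* Push the object back on the free list instead of calling free(). */
void dio_cache_free(struct dio_obj *dio)
{
	union cache_slot *slot = (union cache_slot *)dio;

	slot->next_free = free_list;
	free_list = slot;
}
```

An object returned by dio_cache_alloc() is not zeroed, so the caller
initializes only the fields it needs, and an alloc right after a free
returns the same, likely still cache-hot, memory.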

Dan, could you please test this patch with your test case and compare it
against the fast path again?

Please test with CONFIG_CC_OPTIMIZE_FOR_SIZE and CONFIG_OPTIMIZE_INLINING
both disabled.

Thanks,

-Andi