Part of the patch is based on Dave's previous post.
This patch submits I/O to the backing file system via kernel
aio, which brings the following benefits:
- double caching in both the loop file system and the backing
file is avoided
- context switches are reduced a lot, which in turn lowers
CPU utilization
- cached memory is reduced a lot
The main side effect is that throughput decreases when the raw
loop block device is accessed directly (not through a file
system) with kernel aio.
This patch has passed xfstests (./check -g auto) with both the
test and scratch devices being loop block devices and ext4 as
the file system.
The results of two fio tests follow:
1. fio test inside ext4 file system over loop block
1) How to run
- linux kernel base: 3.19.0-rc3-next-20150108(loop-mq merged)
- loop over SSD image 1 in ext4
- linux psync, 16 jobs, size 200M, ext4 over loop block
- test result: IOPS from fio output
2) Throughput result (IOPS):
-------------------------------------------------------------
test cases      |randread |read     |randwrite |write    |
-------------------------------------------------------------
base            |16799    |59508    |31059     |58829    |
-------------------------------------------------------------
base+kernel aio |15480    |64453    |30187     |57222    |
-------------------------------------------------------------
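To make the table easier to read, the relative change of
"base+kernel aio" against "base" can be worked out per test case.
The snippet below is only a small sketch of that arithmetic; the
names base, aio and relative_change are illustrative and are not
part of the patch or of fio.

```python
# IOPS figures copied from the throughput table above.
base = {"randread": 16799, "read": 59508, "randwrite": 31059, "write": 58829}
aio = {"randread": 15480, "read": 64453, "randwrite": 30187, "write": 57222}


def relative_change(before, after):
    """Return the change from `before` to `after` as a percentage."""
    return (after - before) / before * 100.0


# Percentage change of base+kernel aio relative to base, per test case.
changes = {name: relative_change(base[name], aio[name]) for name in base}

for name, pct in changes.items():
    print(f"{name:10s} {pct:+.1f}%")
```

With these numbers, randread drops by about 7.9%, read gains
about 8.3%, and randwrite and write drop by about 2.8% and 2.7%.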