> Are you suggesting that the model with cache=writeback gives us the
> same I/O pattern as cache=none, so there are no opportunities for
> optimization?
>
> I assume you are talking of dedicated disk partitions and not
> individual disk images residing on the same partition.

Let's assume the guest has virtio (I agree that with IDE we need
reordering on the host). The guest sends batches of I/O separated
by cache flushes. If the batches are smaller than the virtio queue
length, ideally things look like:
io_submit(..., batch_size_1);
io_getevents(..., batch_size_1);
fdatasync();
io_submit(..., batch_size_2);
io_getevents(..., batch_size_2);
fdatasync();
io_submit(..., batch_size_3);
io_getevents(..., batch_size_3);
fdatasync();
(certainly that won't happen today, but it could in principle).
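To make that concrete, here is a rough sketch of one such flush-delimited
batch using Linux AIO (libaio). This is just an illustration, not qemu
code: the file name, batch size and offsets are made up, and most error
handling is omitted.

/*
 * One flush-delimited batch with Linux AIO (libaio).
 * Build with: gcc batch.c -laio
 */
#define _GNU_SOURCE             /* for O_DIRECT */
#include <fcntl.h>
#include <libaio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BATCH 8                 /* "batch_size_1" in the text */
#define BLKSZ 4096              /* O_DIRECT-friendly block size */

int main(void)
{
    io_context_t ctx = 0;
    struct iocb iocbs[BATCH], *iocbp[BATCH];
    struct io_event events[BATCH];
    int fd, i;

    /* cache=none style: bypass the host page cache */
    fd = open("disk.img", O_RDWR | O_DIRECT);
    if (fd < 0 || io_setup(BATCH, &ctx) < 0)
        return 1;

    /* one guest batch, i.e. the writes between two cache flushes */
    for (i = 0; i < BATCH; i++) {
        void *buf;
        if (posix_memalign(&buf, BLKSZ, BLKSZ))
            return 1;
        memset(buf, 0, BLKSZ);
        io_prep_pwrite(&iocbs[i], fd, buf, BLKSZ, (long long)i * BLKSZ);
        iocbp[i] = &iocbs[i];
    }

    io_submit(ctx, BATCH, iocbp);                  /* io_submit(..., batch_size_1) */
    io_getevents(ctx, BATCH, BATCH, events, NULL); /* io_getevents(..., batch_size_1) */
    fdatasync(fd);                                 /* the guest's cache flush */

    io_destroy(ctx);
    close(fd);
    return 0;
}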
How does a write cache give any advantage? The host kernel sees
_exactly_ the same information as it would from a bunch of threaded
pwritev()s followed by fdatasync().
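For comparison, the "bunch of threaded pwritev()s followed by
fdatasync()" case could be sketched like this (again purely
illustrative, with made-up names and sizes):

/*
 * The same batch issued as buffered writes from worker threads,
 * followed by a single fdatasync().
 * Build with: gcc threaded.c -lpthread
 */
#define _GNU_SOURCE             /* for pwritev() */
#include <fcntl.h>
#include <pthread.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

#define NREQ  8
#define BLKSZ 4096

static int fd;
static char bufs[NREQ][BLKSZ];

static void *worker(void *arg)
{
    long i = (long)arg;
    struct iovec iov = { .iov_base = bufs[i], .iov_len = BLKSZ };

    /* buffered write: the data lands in the host page cache */
    pwritev(fd, &iov, 1, (off_t)i * BLKSZ);
    return NULL;
}

int main(void)
{
    pthread_t tids[NREQ];
    long i;

    fd = open("disk.img", O_RDWR);  /* cache=writeback: no O_DIRECT */
    if (fd < 0)
        return 1;
    memset(bufs, 0, sizeof(bufs));

    for (i = 0; i < NREQ; i++)
        pthread_create(&tids[i], NULL, worker, (void *)i);
    for (i = 0; i < NREQ; i++)
        pthread_join(tids[i], NULL);

    /* by now the kernel knows the same thing as in the AIO case:
       this set of dirty ranges, followed by a flush */
    fdatasync(fd);
    close(fd);
    return 0;
}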
(wish: IO_CMD_ORDERED_FDATASYNC)
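(For what it's worth, libaio already defines IO_CMD_FDSYNC, set up via
io_prep_fdsync(), so a flush can at least be queued through the same
context; but there is no ordering guarantee against the writes submitted
alongside it, and kernel support for AIO fsync has been absent or limited
for a long time, so something like the helper below - name made up - still
has to reap the writes before submitting the sync.)

#include <libaio.h>

/* Queue the flush itself with IO_CMD_FDSYNC, after waiting for the
   writes; an ordered fdatasync command would let us skip the wait. */
static void flush_after_batch(io_context_t ctx, int fd,
                              struct iocb **writes, int nr)
{
    struct io_event events[nr];
    struct iocb flush, *flushp = &flush;

    io_submit(ctx, nr, writes);                /* the data */
    io_getevents(ctx, nr, nr, events, NULL);   /* no ordered flush: wait */

    io_prep_fdsync(&flush, fd);                /* IO_CMD_FDSYNC */
    io_submit(ctx, 1, &flushp);
    io_getevents(ctx, 1, 1, events, NULL);
}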
If the batch size is larger than the virtio queue size, or if there
are no flushes at all, then yes the huge write cache gives more
opportunity for reordering. But we're already talking hundreds of
requests here.
Let's say the virtio queue size were unlimited. What
merging/reordering opportunity are we missing on the host? Again we
have exactly the same information: either the pagecache LRU + radix
tree that identifies all dirty pages in disk order, or the block
queue with pending requests that contains exactly the same
information.
Something is wrong. Maybe it's my understanding, but on the other
hand it may be a piece of kernel code.