Re: [PATCH][RF C/T/D] Unmapped page cache control - via boot parameter

From: Avi Kivity
Date: Tue Mar 16 2010 - 12:01:37 EST


On 03/16/2010 04:27 PM, Balbir Singh wrote:

Let's assume the guest has virtio (I agree with IDE we need
reordering on the host). The guest sends batches of I/O separated
by cache flushes. If the batches are smaller than the virtio queue
length, ideally things look like:

io_submit(..., batch_size_1);
io_getevents(..., batch_size_1);
fdatasync();
io_submit(..., batch_size_2);
io_getevents(..., batch_size_2);
fdatasync();
io_submit(..., batch_size_3);
io_getevents(..., batch_size_3);
fdatasync();

(certainly that won't happen today, but it could in principle).

How does a write cache give any advantage? The host kernel sees
_exactly_ the same information as it would from a bunch of threaded
pwritev()s followed by fdatasync().
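
For concreteness, the "threaded pwritev()s followed by fdatasync()" pattern could look roughly like the sketch below. The names struct write_req, write_worker and submit_batch_and_flush are hypothetical, invented for this illustration; this is not qemu's actual I/O path, only one way to issue a batch from several threads and then flush.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical request descriptor: one buffer written at one file offset. */
struct write_req {
    int fd;
    off_t offset;
    void *buf;
    size_t len;
    ssize_t result;            /* bytes written, or -1 on error */
};

/* Thread body: perform a single positional write. */
static void *write_worker(void *arg)
{
    struct write_req *req = arg;
    struct iovec iov = { .iov_base = req->buf, .iov_len = req->len };

    req->result = pwritev(req->fd, &iov, 1, req->offset);
    return NULL;
}

/*
 * Issue every request of one batch from its own thread, wait for all of
 * them to finish, then flush with fdatasync().
 * Returns 0 on success, -1 if any step failed or a write was short.
 */
int submit_batch_and_flush(int fd, struct write_req *reqs, size_t n)
{
    pthread_t *tids = malloc((n ? n : 1) * sizeof(*tids));
    size_t started = 0;
    int rc = 0;

    if (tids == NULL)
        return -1;

    for (size_t i = 0; i < n; i++) {
        if (pthread_create(&tids[i], NULL, write_worker, &reqs[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }

    /* Join only the threads that were actually created. */
    for (size_t i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        if (reqs[i].result != (ssize_t)reqs[i].len)
            rc = -1;
    }
    free(tids);

    /* The flush marks the batch boundary, as in the sequence above. */
    if (fdatasync(fd) != 0)
        rc = -1;
    return rc;
}
```

Between two flushes the host kernel only ever sees the writes of one batch, whichever submission mechanism is used.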

Are you suggesting that the model with cache=writeback gives us the
same I/O pattern as cache=none, so there are no opportunities for
optimization?

Yes. The guest also has a large cache with the same optimization algorithm.


(wish: IO_CMD_ORDERED_FDATASYNC)

If the batch size is larger than the virtio queue size, or if there
are no flushes at all, then yes the huge write cache gives more
opportunity for reordering. But we're already talking hundreds of
requests here.

Let's say the virtio queue size was unlimited. What
merging/reordering opportunity are we missing on the host? Again we
have exactly the same information: either the pagecache lru + radix
tree that identifies all dirty pages in disk order, or the block
queue with pending requests that contains exactly the same
information.
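
As a toy illustration of the "same information" point: given the same set of pending byte ranges, sorting them into disk order and merging neighbours gives the same result whether the ranges came from dirty pages or from queued requests. The sketch below is not the kernel's I/O scheduler; struct io_range and sort_and_merge are hypothetical names made up for the example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pending write: a byte range on the backing device. */
struct io_range {
    unsigned long long start;  /* first byte offset */
    unsigned long long len;    /* length in bytes */
};

/* qsort comparator: ascending by start offset. */
static int cmp_start(const void *a, const void *b)
{
    const struct io_range *x = a, *y = b;

    return (x->start > y->start) - (x->start < y->start);
}

/*
 * Sort the ranges into disk order and merge, in place, any ranges that
 * touch or overlap. Returns the number of ranges left after merging.
 */
size_t sort_and_merge(struct io_range *r, size_t n)
{
    size_t out = 0;

    if (n == 0)
        return 0;

    qsort(r, n, sizeof(*r), cmp_start);

    for (size_t i = 1; i < n; i++) {
        unsigned long long end = r[out].start + r[out].len;

        if (r[i].start <= end) {
            /* Adjacent or overlapping: extend the current range. */
            unsigned long long new_end = r[i].start + r[i].len;

            if (new_end > end)
                r[out].len = new_end - r[out].start;
        } else {
            /* Gap on disk: start a new merged range. */
            r[++out] = r[i];
        }
    }
    return out + 1;
}
```

The result depends only on the input set, not on the order in which the ranges arrived or on which data structure held them.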

Something is wrong. Maybe it's my understanding, but on the other
hand it may be a piece of kernel code.

I assume you are talking about dedicated disk partitions, not
individual disk images residing on the same partition.

Correct. Images in files introduce new writes which can be optimized.

--
error compiling committee.c: too many arguments to function
