Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
From: Suparna Bhattacharya
Date: Fri Feb 23 2007 - 08:52:47 EST
On Fri, Feb 23, 2007 at 01:52:47PM +0100, Jens Axboe wrote:
> On Wed, Feb 21 2007, Ingo Molnar wrote:
> > this is the v3 release of the syslet/threadlet subsystem:
> >
> > http://redhat.com/~mingo/syslet-patches/
>
> [snip]
>
> Ingo, some testing of the experimental syslet queueing stuff, in the
> syslet-testing branch of fio.
>
> Fio job file:
>
> [global]
> bs=8k
> size=1g
> direct=0
> ioengine=syslet-rw
> iodepth=32
> rw=read
>
> [file]
> filename=/ramfs/testfile
>
> Only changes between runs was changing ioengine and iodepth as indicated
> in the table below.
>
> Results:
>
> Engine          Depth   Bw (MiB/sec)
> --------------------------------------------
> libaio              1   441
> syslet              1   574
> sync                1   589
> libaio             32   613
> syslet             32   681
>
> Results are stable to within +/- 1MiB/sec. So you can see that syslet
> are still a bit slower than sync for depth 1, but beats the pants off
> libaio for equal depths. Note that this is buffered IO, I'll be out for
> the weekend, but I'll hack some direct IO testing up next week to
> compare "real" queuing.
>
> Just a quick microbenchmark to gauge current overhead...

This is just ramfs, to gauge pure overheads, is that correct?

BTW, I'm not surprised at Ingo's initial results of syslet vs libaio
overheads for aio-stress/fio type streaming IO runs, because these cases
do not involve large numbers of outstanding IOs. The overhead of thread
creation with syslets is therefore amortized across the entire run of IO
submissions, because already created async threads are reused. In the
libaio case, by contrast, there is a setup and teardown of kiocbs per
request.

What I have been concerned about in the past, when considering thread-based
AIO implementations, is instead the impact of resource (memory) consumption
on overall system performance and on adaptability to varying loads. It is
nice that we can avoid that for the cached cases, but for the general
blocking cases it is still not clear to me whether we have addressed this
well enough yet. I used to think that even the kiocb was too heavyweight for
its purpose, especially in terms of scaling to larger loads.
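
To put rough numbers on the memory concern, here is a back-of-the-envelope
sketch. It assumes that every outstanding blocking IO pins one async thread
with its own kernel stack, and it compares that with one kiocb per request.
Both per-IO sizes are illustrative placeholders chosen for this sketch, not
measurements from these patches.

```python
# Back-of-the-envelope footprint for N outstanding IOs.
# Both per-IO sizes below are illustrative assumptions, not measurements.
KERNEL_STACK_BYTES = 8 * 1024   # assumed: one kernel stack per blocked async thread
KIOCB_BYTES = 256               # assumed: placeholder size for one kiocb


def footprint_mib(outstanding_ios, per_io_bytes):
    """Memory pinned by the outstanding IOs, in MiB."""
    return outstanding_ios * per_io_bytes / (1024 * 1024)


# Compare a small and a very large number of outstanding IOs.
for depth in (64, 20000):
    threads = footprint_mib(depth, KERNEL_STACK_BYTES)
    kiocbs = footprint_mib(depth, KIOCB_BYTES)
    print(f"depth {depth:>5}: threads ~{threads:.2f} MiB, kiocbs ~{kiocbs:.2f} MiB")
```

The absolute values depend entirely on the assumed sizes; the point is only
that the per-IO cost is multiplied by the number of outstanding requests, so
a heavier per-IO object matters far more at large depths.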

As a really crude (and not very realistic) example of the potential impact
of large numbers of outstanding IOs, I tried some quick direct IO comparisons
using fio:

[global]
ioengine=syslet-rw
buffered=0
rw=randread
bs=64k
size=1024m
iodepth=64

Engine          Depth   Bw (MiB/sec)
--------------------------------------------
libaio             64   17.323
syslet             64   17.524
libaio          20000   15.226
syslet          20000   11.015

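
For reference, the relative drop from depth 64 to depth 20000 can be worked
out from the table above. This is only arithmetic on the four figures already
listed (roughly a 12% drop for libaio and 37% for syslet); the helper name
drop_pct is made up for this sketch.

```python
# Throughput figures (MiB/sec) copied from the table above.
results = {
    "libaio": {64: 17.323, 20000: 15.226},
    "syslet": {64: 17.524, 20000: 11.015},
}


def drop_pct(before, after):
    """Relative throughput loss going from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


for engine, bw in results.items():
    print(f"{engine}: {drop_pct(bw[64], bw[20000]):.1f}% lower at depth 20000")
```
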
Regards
Suparna
--
Suparna Bhattacharya (suparna@xxxxxxxxxx)
Linux Technology Center
IBM Software Lab, India