Re: [patch] readv/writev rework

From: Andrew Morton (akpm@digeo.com)
Date: Thu Sep 12 2002 - 22:31:16 EST


Hirokazu Takahashi wrote:
>
> Hello,
>
> > > Your readv/writev patch interested me and I checked it.
> > > I found we also have a chance to improve normal writev.
> > >
> > > a_ops->prepare_write() and a_ops->commit_write will have a
> > > penalty when I/O size isn't PAGE_SIZE.
> > > With following patch generic_file_write_nolock() will try to
> > > make each I/O size become PAGE_SIZE.
> > >
> >
> > Certainly makes a lot of sense. If an application has a large
> > number of small objects which are to be appended to a file, and
> > they are not contiguous in user memory then this patch makes
> > writev() a very attractive way of doing that. Tons faster.

I wrote a little app which simulates a text editor writing out
its buffer. Just:

struct line {
        char *data;
        int length;
        struct line *next;
};

walk this linked list, writing the lines out. The input was
`cat linux/kernel/*.c > inputfile' and the output was written
1000 times (300 megs). Benched four different ways of writing the
output:

                    2.5.34 2.5.34-mm2 2.5.34-mm2-taka

write 54s 54s 55s
fwrite 12.8s 12.8s 12.7s
fwrite_unlocked 11.6s 11.6s 11.5s
writev 39s 33.4s 15.8s

So Janet's patch made a 15% improvement with this test. Yours
dropped it 50% again.

> Yeah, I realized syslogd is using writev against logfiles which are
> opened with O_SYNC flag! I think heavy loaded mail-servers or
> web-servers may get good performance with the new writev
> as they are logging too much.

O_SYNC writev? Ooh, oww, that hurts...

With 2.5.34, writing the 300k file once (1000x less data than above)
with 1024-vector writev's, opened O_SYNC: 68 seconds.

With 2.5.34-mm2-taka the same write takes 0.23 seconds. (I had to write
100x as much data just to get a measurement).

A 300x speedup is nice, but based on these numbers syslogd should be
using fwrite_unlocked() and fflush().

O_SYNC should be eradicated. It's basically always the wrong thing
to do. Applications should write as much stuff as they can and then
run fsync.

>
> It sounds nice.
> I'll rewrite it soon.
>

Great. The test app is at http://www.zip.com.au/~akpm/writev-speed.c
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Sep 15 2002 - 22:00:31 EST