Re: [performance bug] kernel building regression on 64 LCPUs machine

From: Jan Kara
Date: Wed Jan 19 2011 - 07:56:39 EST


On Wed 19-01-11 10:03:26, Shaohua Li wrote:
> add Jan and Theodore to the loop.
Thanks.

> On Wed, 2011-01-19 at 09:55 +0800, Shi, Alex wrote:
> > Shaohua and I tested kernel building performance on the latest kernel and
> > found that it dropped about 15% on our 64 LCPUs NHM-EX machine on the ext4
> > file system. We found that this performance drop is due to commit
> > 749ef9f8423054e326f. If we revert this patch, or just change
> > WRITE_SYNC back to WRITE in the jbd2/commit.c file, the performance
> > recovers.
> >
> > The iostat report shows that with the commit, the number of read request
> > merges increased and the number of write request merges dropped. The total
> > request size increased and the queue length dropped. So we tested another
> > patch that only changes WRITE_SYNC to WRITE_SYNC_PLUG in jbd2/commit.c,
> > but it had no effect.
> Since WRITE_SYNC_PLUG doesn't help, this isn't a simple no-write-merge issue.
>
> > We didn't test the deadline IO mode, just cfq. It seems that inserting
> > write requests into the sync queue affects read performance a lot, but we
> > don't know the details. What are your comments on this?
Indeed, it seems that the optimization of the case where we wait for the
transaction is negatively impacting performance when we are not waiting.
Does the patch below help for your load? It refines the logic deciding when
WRITE_SYNC is needed (of course, we should also test whether the patch works
for fsync-heavy loads).
The patch is mostly a proof of concept and only lightly tested, so be
careful...

Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
---