Re: ext3 journal commit performance

From: Andrew Morton
Date: Wed Mar 02 2005 - 13:06:06 EST


Muthian Sivathanu <muthian_s@xxxxxxxxx> wrote:
>
> Hi,
>
> I have a question on ext3 journal commit code. When a
> transaction is committed in the ordered mode, ext3
> first issues the data writes, waits for them to
> finish, then issues the journal writes, waits for them
> to finish, and then writes out the commit record.
>
> It appears that the first wait (for the data blocks)
> is unnecessary because all that is required is that
> before the commit, both the data and the metadata
> blocks should be on disk. This extra wait can
> potentially reduce performance in cases where the
> journal is on a separate disk, because you lose
> parallelism between data writes and the metadata
> writes.

1) write the data
2) wait on the data write
3) write the journal
4) wait on the journal write

If we were to omit step 2), we wouldn't be ordering data any more: we will
commit the journal while there are still data writes in flight.

However, what you are proposing is, I think,

1) write the data
2) write the journal
3) wait on the data write
4) wait on the journal write

That would work, and could possibly speed things up a little.

But bear in mind that the journal write is just a single seek, and the
journal tends to be at one end of the disk, and we need to seek to it
anyway. There would be some opportunity for the elevator and the disk to
optimise away a seek.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/