Re: [patch/rft] jbd2: tag journal writes as metadata I/O

From: Jeff Moyer
Date: Mon Apr 05 2010 - 11:24:36 EST

Jan Kara <jack@xxxxxxx> writes:

> Hi,
>> In running iozone for writes to small files, we noticed a pretty big
>> discrepancy between the performance of the deadline and cfq I/O
>> schedulers. Investigation showed that I/O was being issued from 2
>> different contexts: the iozone process itself, and the jbd2/sdh-8 thread
>> (as expected). Because of the way cfq performs slice idling, the delays
>> introduced between the metadata and data I/Os were significant. For
>> example, cfq would see about 7MB/s versus deadline's 35MB/s for the same
>> workload. I also tested fs_mark with writing and fsyncing 1000 64k
>> files, and a similar 5x performance difference was observed. Eric
>> Sandeen suggested that I flag the journal writes as metadata, and once I
>> did that, the performance difference went away completely (cfq has
>> special logic to prioritize metadata I/O).
>> So, I'm submitting this patch for comments and testing. I have a
>> similar patch for jbd that I will submit if folks agree that this is a
>> good idea.
> This looks like a good idea to me. I'd just be careful about data=journal
> mode where even data is written via journal and thus you'd incorrectly
> prioritize all the IO. I suppose that could have a negative impact on performance
> of other filesystems on the same disk. So for data=journal mode, I'd leave
> write_op to be just WRITE / WRITE_SYNC_PLUG.

Hi, Jan, thanks for the review! I'm trying to figure out the best way
to relay the journal mode from ext3 or ext4 to jbd or jbd2. Would a new
journal flag, set in journal_init_inode, be appropriate? This wouldn't
cover the case of data journalling set per inode, though. It also puts
some ext3-specific code into the purportedly fs-agnostic jbd code
(specifically, testing the superblock for the data journal mount flag).
Do you have any suggestions?
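
To make the question concrete, here is a rough user-space sketch of the
journal-flag idea. It is not the patch and not kernel code: every name in
it (SKETCH_JFLAG_DATA_JOURNAL, struct sketch_journal, sketch_pick_write_op,
and the SKETCH_OP_* bits) is made up for illustration, and the op bits
merely stand in for WRITE / WRITE_SYNC_PLUG and a metadata tag. The
assumption is that the filesystem sets the flag once at journal setup, so
the journal code only tests its own flag and never looks at the superblock.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the block layer's write ops (not kernel values). */
#define SKETCH_OP_WRITE      0x1u  /* plain write */
#define SKETCH_OP_SYNC_PLUG  0x2u  /* synchronous write, as in WRITE_SYNC_PLUG */
#define SKETCH_OP_META       0x4u  /* write tagged as metadata */

/* Hypothetical journal flag the filesystem would set for data=journal mounts. */
#define SKETCH_JFLAG_DATA_JOURNAL 0x1u

/* Minimal stand-in for the journal structure; only the flags matter here. */
struct sketch_journal {
    unsigned int flags;
};

/*
 * Choose the op for a journal write. The metadata tag is added only when
 * the journal is known not to carry file data, so data=journal writes are
 * left as plain WRITE / WRITE_SYNC_PLUG equivalents.
 */
static unsigned int sketch_pick_write_op(const struct sketch_journal *j, bool sync)
{
    unsigned int op = SKETCH_OP_WRITE;

    if (sync)
        op |= SKETCH_OP_SYNC_PLUG;
    if (!(j->flags & SKETCH_JFLAG_DATA_JOURNAL))
        op |= SKETCH_OP_META;
    return op;
}
```

A journal-wide flag like this still says nothing about inodes that have
data journalling enabled individually, which is the case I don't see how
to cover.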

