Re: [ext3] kjournald writing after each read despite noatime,commit=nnn

From: Bart Samwel
Date: Thu Jan 01 2009 - 14:00:17 EST


Hi Dave,

Dave Johnson wrote:
> Bart Samwel writes:
>> This is the defined behaviour for laptop_mode. Whenever a *physical*
>> READ takes place, this is taken to indicate that the disk is spun up at
>> that time. The laptop_mode functionality then takes that opportunity to
>> sync any dirty data to disk, two seconds (or whatever value you put in
>> /proc/sys/vm/laptop_mode) after the physical disk activity has ceased.
>> The rationale behind this is that you want to sync your stuff when the
>> disk is spun up, and then you want to hold back writing back stuff for a
>> very long while. And the only way it can detect that the disk is spun up
>> is when there is physical disk activity.
>>
>> This is exactly what happens in your case. The READ activity reported by
>> block_dump is *physical* read activity: some data was needed that was
>> not cached in memory. block_dump does not show you what data was
>> retrieved from the ext3 fs *without* having to access the disk, it only
>> shows actual physical disk I/O.
>
> Yep sounds good, but this happens even if there is no dirty data
> needing a sync back to disk.
>
> $ grep 'Dirty\|Write' /proc/meminfo
> Dirty: 0 kB
> Writeback: 0 kB
> WritebackTmp: 0 kB
> $ cat /some/uncached/file >/dev/null
>
> Jan 1 11:43:49 gw kernel: cat(6615): READ block 864408 on hda1
> Jan 1 11:43:51 gw kernel: kjournald(760): WRITE block 2376 on hda1

This looks like it's a generic property of syncing an ext3 file system.
Try turning off laptop_mode and then running "sync". You will probably
see the same behaviour.
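
A rough sketch of that check, in case it helps. The helper names
(set_vm_knob, journal_writes) and the VM_DIR variable are made up for
illustration, and I am assuming block_dump output ends up in the kernel
log in the same format as the lines you pasted:

```shell
#!/bin/sh
# Sketch only. VM_DIR defaults to the real procfs location (root needed);
# it is a hypothetical override so the helpers can be tried elsewhere.
VM_DIR="${VM_DIR:-/proc/sys/vm}"

# Write a value to one of the vm knobs (e.g. laptop_mode, block_dump).
set_vm_knob() {
    printf '%s\n' "$2" > "$VM_DIR/$1"
}

# Keep only the kjournald WRITE lines from block_dump output on stdin.
journal_writes() {
    grep 'kjournald([0-9]*): WRITE block'
}

# On the real machine, as root:
#   set_vm_knob laptop_mode 0
#   set_vm_knob block_dump 1
#   sync
#   dmesg | journal_writes
```

If a kjournald WRITE shows up after the plain "sync" as well, the write
comes from the ext3 sync itself rather than from laptop_mode.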

> Note, the reason I ask is that this is an SSD, so unneeded writes should
> be avoided even if a physical read has taken place recently.
>
> Turning laptop_mode to 0, but leaving other settings the same
> resolves the unneeded write:

For your SSD I guess you need to get rid of the
sync-after-disk-activity, but keep the other VM behaviours of
laptop_mode (such as avoiding swapping out pages / writing back dirty
pages in order to free memory as long as it is also possible to just
drop pages that are not dirty).

You can probably achieve this by:
- having a large commit interval etc., like you have now
- setting laptop_mode to a very large value, e.g. a couple of hours.

That will trigger a sync if and only if there has been *no* disk
activity at all for hours on end -- i.e., pretty much never. And the
other write-reducing VM features of laptop_mode will still be enabled.
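
Something along these lines should do for the laptop_mode part. This is
only a sketch: the helper name set_laptop_mode_hours and the VM_DIR
variable are made up for illustration, and it assumes the value in
/proc/sys/vm/laptop_mode is in seconds, as with the two-second delay
mentioned above:

```shell
#!/bin/sh
# Sketch only. VM_DIR defaults to the real procfs location (root needed);
# it is a hypothetical override so the helper can be tried elsewhere.
VM_DIR="${VM_DIR:-/proc/sys/vm}"

# Set the laptop_mode delay from a whole number of hours.
# Assumes the knob is interpreted in seconds.
set_laptop_mode_hours() {
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: set_laptop_mode_hours HOURS" >&2
            return 1
            ;;
    esac
    printf '%s\n' "$(( $1 * 3600 ))" > "$VM_DIR/laptop_mode"
}

# On the real machine, as root:
#   set_laptop_mode_hours 2
```

Any non-zero value keeps laptop_mode enabled, so the only thing that
changes is how long after the last disk activity the sync is scheduled.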

It would perhaps be a good thing to split these mechanisms into separate
knobs. Write batching (the sync-after-disk-activity stuff and also the
dirty_ratio / dirty_background_ratio changes) is a completely separate
mechanism from write avoidance (the other mechanism I mentioned).

Cheers,
Bart