Re: [PATCH RFC] vfs: add a O_NOMTIME flag

From: Austin S Hemmelgarn
Date: Tue May 12 2015 - 07:46:07 EST


On 2015-05-12 01:08, Kevin Easton wrote:
On Mon, May 11, 2015 at 07:10:21PM -0400, Theodore Ts'o wrote:
On Mon, May 11, 2015 at 09:24:09AM -0700, Sage Weil wrote:
Let me re-ask the question that I asked last week (and was apparently
ignored). Why not try to use the lazytime feature instead of
pointing a gun straight at the application's --- and system
administrators' --- heads?

Sorry Ted, I thought I responded already.

The goal is to avoid inode writeout entirely when we can, and
as I understand it lazytime will still force writeout before the inode
is dropped from the cache. In systems like Ceph in particular, the
IOs can be spread across lots of files, so simply deferring writeout
doesn't always help.

Sure, but it would reduce the writeout by orders of magnitude. I can
understand if you want to reduce it further, but it might be good
enough for your purposes.

I considered doing the equivalent of O_NOMTIME for our purposes at
$WORK, and our use case is actually not that different from Ceph's
(i.e., using a local disk file system to support a cluster file
system), and lazytime was (a) something I figured I could upstream in
good conscience, and (b) more than good enough for us.

A safer alternative might be a chattr file attribute that if set, the
mtime is not updated on writes, and stat() on the file always shows the
mtime as "right now". At least that way, the file won't accidentally
get left out of backups that rely on the mtime.

(If the file attribute is later cleared, you immediately update the mtime
at that point too, and from then on the file is back to normal).

I like this even better than the flag suggestion: it provides better control, means that you don't need to update applications to get the benefits, and prevents backup software from breaking (although backups would be bigger).
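
To check that I'm reading the proposed semantics correctly, here is a rough user-space model of them. It is only a sketch, not kernel code: the class, the `nomtime` attribute name, and the `inode_dirtied` counter are all hypothetical names I made up for illustration, and the injectable clock is only there so the behaviour can be checked deterministically.

```python
import time


class ModelInode:
    """Toy model of a file carrying the proposed attribute (hypothetical)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self.mtime = clock()    # timestamp stored in the (modelled) inode
        self.nomtime = False    # hypothetical chattr-style attribute
        self.inode_dirtied = 0  # times the inode was dirtied for an mtime update

    def write(self):
        """Model a data write to the file."""
        if self.nomtime:
            # Attribute set: the stored mtime is left alone, so the write
            # does not dirty the inode for a timestamp update.
            return
        self.mtime = self._clock()
        self.inode_dirtied += 1

    def stat_mtime(self):
        """Model the mtime that stat() would report."""
        # Attribute set: always report "right now", so tools that select
        # files by mtime never skip the file.
        return self._clock() if self.nomtime else self.mtime

    def set_nomtime(self, enabled):
        """Model setting or clearing the attribute."""
        if self.nomtime and not enabled:
            # Clearing the attribute updates the stored mtime immediately;
            # from here on the file behaves normally again.
            self.mtime = self._clock()
            self.inode_dirtied += 1
        self.nomtime = enabled
```

In this model, writes made while the attribute is set never touch the stored mtime, which is where the saving would come from. Because stat() always reports the current time for such a file, an mtime-based incremental backup would pick it up on every run, which is why I'd expect backups to get bigger.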

