Re: [PATCH RFC] vfs: add a O_NOMTIME flag

From: John Stoffel
Date: Tue May 12 2015 - 09:55:14 EST


>>>>> "Austin" == Austin S Hemmelgarn <ahferroin7@xxxxxxxxx> writes:

Austin> On 2015-05-12 01:08, Kevin Easton wrote:
>> On Mon, May 11, 2015 at 07:10:21PM -0400, Theodore Ts'o wrote:
>>> On Mon, May 11, 2015 at 09:24:09AM -0700, Sage Weil wrote:
>>>>> Let me re-ask the question that I asked last week (and was apparently
>>>>> ignored). Why not try to use the lazytime feature instead of
>>>>> pointing a gun straight at the application's --- and system
>>>>> administrators' --- heads?
>>>>
>>>> Sorry Ted, I thought I responded already.
>>>>
>>>> The goal is to avoid inode writeout entirely when we can, and
>>>> as I understand it lazytime will still force writeout before the inode
>>>> is dropped from the cache. In systems like Ceph in particular, the
>>>> IOs can be spread across lots of files, so simply deferring writeout
>>>> doesn't always help.
>>>
>>> Sure, but it would reduce the writeout by orders of magnitude. I can
>>> understand if you want to reduce it further, but it might be good
>>> enough for your purposes.
>>>
>>> I considered doing the equivalent of O_NOMTIME for our purposes at
>>> $WORK, and our use case is actually not that different from Ceph's
>>> (i.e., using a local disk file system to support a cluster file
>>> system), and lazytime was (a) something I figured I could upstream
>>> in good conscience, and (b) more than good enough for us.
>>
>> A safer alternative might be a chattr file attribute: while it is set,
>> the mtime is not updated on writes, and stat() on the file always shows
>> the mtime as "right now". At least that way, the file won't accidentally
>> get left out of backups that rely on the mtime.
>>
>> (If the file attribute is unset, you immediately update the mtime then
>> too, and from then on the file is back to normal).
>>

Austin> I like this even better than the flag suggestion; it provides
Austin> better control, means that you don't need to update
Austin> applications to get the benefits, and prevents backup software
Austin> from breaking (although backups would be bigger).

Me too; it fails in a safer mode, where you do more work on backups
than strictly needed. I'm still against this as a mount option
though: way way way too many bullets in the foot gun. And as someone
else said, once you mount with O_NOMTIME, then unmount, then mount
again without O_NOMTIME, you've lost information. Not good.
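
To make the "fails in a safer mode" point concrete, here is a rough
user-space model of the attribute semantics Kevin describes above. It is
only a sketch, not kernel code: the ModelFile class, the nomtime field,
and needs_backup() are hypothetical names invented for illustration, and
the backup check assumes a tool that selects files purely by comparing
mtime against the time of the last backup.

```python
import time


class ModelFile:
    """Toy in-memory stand-in for an inode; purely illustrative."""

    def __init__(self, now=time.time):
        self._now = now        # clock source, injectable so it can be faked
        self.mtime = now()     # the stored ("on-disk") mtime
        self.nomtime = False   # hypothetical attribute from the proposal
        self.data = b""

    def write(self, buf):
        self.data += buf
        if not self.nomtime:
            # Normal behaviour: every write dirties the stored mtime.
            self.mtime = self._now()
        # With the attribute set, the stored mtime is left alone, so the
        # write by itself does not require an inode update.

    def stat_mtime(self):
        # With the attribute set, stat() reports "right now" instead of
        # the (possibly stale) stored value.
        return self._now() if self.nomtime else self.mtime

    def set_nomtime(self, enabled):
        # Clearing the attribute updates the stored mtime immediately,
        # after which the file behaves normally again.
        if self.nomtime and not enabled:
            self.mtime = self._now()
        self.nomtime = enabled


def needs_backup(f, last_backup_time):
    """Assumed mtime-based incremental backup check."""
    return f.stat_mtime() > last_backup_time
```

In this model a file with the attribute set always looks newer than the
last backup, so it gets copied every time: more work than strictly
needed, but nothing is silently skipped.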
