Re: Recursive directory accounting for size, ctime, etc.

From: John Stoffel
Date: Fri Aug 08 2008 - 09:11:37 EST


>>>>> "Pavel" == Pavel Machek <pavel@xxxxxxx> writes:

Pavel> On Tue 2008-07-15 11:28:22, Sage Weil wrote:
>> All-
>>
>> Ceph is a new distributed file system for Linux designed for scalability
>> (terabytes to exabytes, tens to thousands of storage nodes), reliability,
>> and performance. The latest release (v0.3), aside from xattr support and
>> the usual slew of bugfixes, includes a unique (?) recursive accounting
>> infrastructure that allows statistics about all metadata nested beneath a
>> point in the directory hierarchy to be efficiently propagated up the tree.
>> Currently this includes a file and directory count, total bytes (summation
>> over file sizes), and most recent inode ctime. For example, for a
>> directory like /home, Ceph can efficiently report the total number of
>> files, directories, and bytes contained by that entire subtree of the
>> directory hierarchy.
>>
>> The file size summation is the most interesting, as it effectively gives
>> you directory-based quota space accounting with fine granularity. In many
>> deployments, the quota _accounting_ is more important than actual
>> enforcement. Anybody who has had to figure out what has filled/is filling
>> up a large volume will appreciate how cumbersome and inefficient 'du' can
>> be for that purpose--especially when you're in a hurry.
>>
>> There are currently two ways to access the recursive stats via a standard
>> shell. The first simply sets the directory st_size value to the
>> _recursive_ bytes ('rbytes') value (when the client is mounted with -o
>> rbytes). For example (watch the directory sizes),
Pavel> ...
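
To make the recursive stats described above concrete, here is a minimal
Python sketch of per-directory totals that are updated by walking up the
ancestor chain on every change. It is not Ceph's implementation: the names
Dir, mkdir and add_file are made up for illustration, and it propagates
immediately, whereas Sage describes opportunistic, delayed propagation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dir:
    """A directory carrying recursive totals for its whole subtree."""
    name: str
    parent: Optional["Dir"] = None
    rfiles: int = 0      # files anywhere beneath this directory
    rsubdirs: int = 0    # directories anywhere beneath this directory
    rbytes: int = 0      # sum of file sizes beneath this directory
    rctime: float = 0.0  # most recent ctime seen beneath this directory

    def _propagate(self, files: int, subdirs: int, nbytes: int,
                   ctime: float) -> None:
        # Apply the same delta to this directory and every ancestor.
        # Cost is O(depth of the tree), not O(size of the subtree).
        node = self
        while node is not None:
            node.rfiles += files
            node.rsubdirs += subdirs
            node.rbytes += nbytes
            node.rctime = max(node.rctime, ctime)
            node = node.parent

    def mkdir(self, name: str, ctime: float) -> "Dir":
        """Create a subdirectory and account for it in all ancestors."""
        child = Dir(name, parent=self)
        self._propagate(0, 1, 0, ctime)
        return child

    def add_file(self, size: int, ctime: float) -> None:
        """Account for a new file of the given size in all ancestors."""
        self._propagate(1, 0, size, ctime)


# Hypothetical tree: /home/alice with one file each in alice and home.
root = Dir("/")
home = root.mkdir("home", ctime=100.0)
alice = home.mkdir("alice", ctime=101.0)
alice.add_file(4096, ctime=102.0)
home.add_file(1000, ctime=103.0)
```

With totals kept this way, reading root.rbytes is a single field lookup
rather than a 'du'-style walk of the whole subtree.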

>> Naturally, there are a few caveats:
>>
>> - There is some built-in delay before statistics fully propagate up
>> toward the root of the hierarchy. Changes are propagated
>> opportunistically when lock/lease state allows, with an upper bound of (by
>> default) ~30 seconds for each level of directory nesting.
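
As a rough worked example of that bound, the sketch below multiplies the
quoted ~30 seconds by the number of directory levels between where a change
happens and where the stats are read. The function name and the way levels
are counted (one step per path component) are assumptions for illustration,
not something taken from Ceph.

```python
PROPAGATION_BOUND_SECS = 30  # approximate default per-level bound quoted above


def worst_case_delay(changed_dir: str, observed_dir: str = "/") -> int:
    """Upper bound in seconds before a change in changed_dir is reflected
    in the recursive stats of observed_dir, which must be an ancestor."""
    changed = [p for p in changed_dir.split("/") if p]
    observed = [p for p in observed_dir.split("/") if p]
    if changed[:len(observed)] != observed:
        raise ValueError("observed_dir must be an ancestor of changed_dir")
    # Assumption: one bounded propagation step per level of nesting.
    levels = len(changed) - len(observed)
    return levels * PROPAGATION_BOUND_SECS
```

Under these assumptions a change in /home/alice/src could take up to about
worst_case_delay("/home/alice/src") seconds, i.e. three levels' worth, to
show up at the root.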

Pavel> Having instant rctime would be very nice -- for stuff like locate and
Pavel> speeding up kde startup.

>> I'm extremely interested in what people think of overloading the file
>> system interface in this way. Handy? Crufty? Dangerous? Does anybody

Pavel> Too ugly to live.

Pavel> What about new rstat() syscall?

Or how about tying this into the quotactl() syscall and extending it a
bit? Say, quotactl2(cmd, device, id, addr, path), which is probably just
as ugly but seems to make more sense.
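
To show what a path-scoped query along those lines might return, here is a
userspace stand-in in Python. Everything in it is hypothetical: no
quotactl2() syscall exists, Q_GETRSTAT is an invented command number, and
the result is computed by walking the tree (the slow, 'du'-like way) instead
of reading totals the filesystem already maintains. The device and id
arguments are accepted only to mirror the proposed argument order.

```python
import os
from typing import Optional

Q_GETRSTAT = 0x900001  # hypothetical command number, not a real quotactl cmd


def quotactl2(cmd: int, device: Optional[str], id_: int, addr: dict,
              path: str) -> int:
    """Fill addr with recursive stats for the subtree at path; return 0."""
    if cmd != Q_GETRSTAT:
        raise ValueError("unsupported command")
    rfiles = rsubdirs = rbytes = 0
    rctime = os.lstat(path).st_ctime
    for dirpath, dirnames, filenames in os.walk(path):
        rsubdirs += len(dirnames)
        for name in dirnames:
            st = os.lstat(os.path.join(dirpath, name))
            rctime = max(rctime, st.st_ctime)
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            rfiles += 1
            rbytes += st.st_size
            rctime = max(rctime, st.st_ctime)
    addr.update(rfiles=rfiles, rsubdirs=rsubdirs, rbytes=rbytes,
                rctime=rctime)
    return 0
```

A caller would pass an empty dict as addr and read the four fields back,
for example quotactl2(Q_GETRSTAT, None, 0, stats, "/home").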

Me, I'd love to have this type of reporting on my filesystems,
especially since it would help me in my day job.

How this would look when the filesystem is exported over NFS is an open
question too.

John