Re: stat benchmark
From: Carl Henrik Lunde
Date: Thu Apr 24 2008 - 17:42:19 EST
On Thu, Apr 24, 2008 at 10:59 PM, Soeren Sandmann <sandmann@xxxxxxxxxxx> wrote:
[ about programs reading all inodes after readdir ]
> Unfortunately, performance of that operation kinda sucks. On my system
> (ext3), it produces:
>
> c-24-61-65-93:~% sudo ./a.out
> Time to readdir(): 0.307671 s
> Time to stat 2349 files: 8.203693 s
>
> 8 seconds is about 80 times slower than what a user perceives as
> "instantly" and slow enough that we really should display a progress
> bar if it can't be fixed.
>
> So I am looking for ways to improve this.
>
> Under the theory that disk seeks are killing us, one idea is to add a
> 'multistat' system call that would allow statting of many files at a
> time, which would give the disk scheduler more to work with.
I have experimented with the same problem. Another idea is to sort the
entries returned by readdir before processing them, which has given me
good results.
This works because:
- For most filesystems there is a high correlation between the inode
number and the sector on the disk.
- Most programs like your example handle the files in the order that
they are returned from readdir
- The time spent sorting is very small compared to the disk seeks
There are several possible ways to implement this:
- reorder the dirents in the kernel for each getdents call
- reorder the dirents in user space, for example by running
qsort in a libc wrapper
- in the file system, optimize the order before writing back a dirty directory
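As a rough illustration of the user-space variant, here is a minimal
sketch that collects all entries from readdir, sorts them by d_ino with
qsort, and only then stats them. It is not the code used for the
experiments mentioned above; the names stat_dir_sorted, struct entry and
cmp_ino are made up for this example, and it assumes d_ino is filled in
by the file system and correlates with on-disk location.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper type: one directory entry we want to stat later. */
struct entry {
    ino_t ino;
    char *name;
};

/* Order entries by ascending inode number. */
static int cmp_ino(const void *a, const void *b)
{
    ino_t x = ((const struct entry *)a)->ino;
    ino_t y = ((const struct entry *)b)->ino;
    return (x > y) - (x < y);
}

/*
 * Read every entry of 'path', sort by inode number, then stat each one.
 * Returns the number of entries stat'ed successfully, or -1 on error.
 */
long stat_dir_sorted(const char *path)
{
    struct entry *v = NULL;
    size_t n = 0, cap = 0, i;
    struct dirent *de;
    long ok = -1;
    int dfd;
    DIR *dir = opendir(path);

    if (!dir)
        return -1;

    /* Pass 1: collect names and inode numbers; no stat calls yet. */
    while ((de = readdir(dir)) != NULL) {
        if (n == cap) {
            size_t ncap = cap ? cap * 2 : 64;
            struct entry *tmp = realloc(v, ncap * sizeof *v);
            if (!tmp)
                goto out;
            v = tmp;
            cap = ncap;
        }
        v[n].name = strdup(de->d_name);
        if (!v[n].name)
            goto out;
        v[n].ino = de->d_ino;
        n++;
    }

    /* The sort is cheap compared to the seeks it is meant to avoid. */
    qsort(v, n, sizeof *v, cmp_ino);

    /* Pass 2: stat in inode order, relative to the open directory. */
    ok = 0;
    dfd = dirfd(dir);
    for (i = 0; i < n; i++) {
        struct stat st;
        if (fstatat(dfd, v[i].name, &st, AT_SYMLINK_NOFOLLOW) == 0)
            ok++;
    }

out:
    for (i = 0; i < n; i++)
        free(v[i].name);
    free(v);
    closedir(dir);
    return ok;
}
```

Calling stat_dir_sorted("/some/dir") on a cold cache and comparing the
time against a plain readdir-order loop should show whether the
reordering helps on a given file system. Doing the same sort inside a
libc wrapper would need more care, since callers may rely on telldir and
seekdir positions.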
This applies not only to programs that merely stat files, but also to
programs that read them, such as file indexers, backups (tar), and
Nautilus generating thumbnails from JPGs.
--
Carl Henrik