Re: VFS scalability

From: Nick Piggin
Date: Mon Jun 27 2005 - 02:37:07 EST


Andi Kleen wrote:
Nick Piggin <nickpiggin@xxxxxxxxxxxx> writes:


This is with the filesystem mounted as noatime, so I can't work
out why update_atime is so high on the list. I suspect maybe a
false sharing issue with some other fields.


Did all 64 CPUs write to the same file?


Yes.

Then update_atime was just the messenger - it is the first function
to read the inode so it eats the cache miss overhead.


I agree.

Maybe adding a prefetch for it at the beginning of sys_read() might
help, but with 64 CPUs writing to parts of the inode it will
always thrash, no matter how many prefetches are added.


True. I'm just not sure what is causing the bouncing - I guess
->f_count due to get_file()?

rw_verify_area is another that is taking a lot of hits - probably
due to the same cacheline(s) as update_atime.
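
As a rough userspace illustration of the kind of bouncing suspected
above (a sketch only, not kernel code): if a counter that every
reader writes shares a cache line with read-mostly fields, then the
first function to touch those fields pays for the miss. The struct
names, field names, and the 64-byte line size below are all
assumptions made for the example; they are not the layout of the real
struct file or struct inode.

```c
#include <assert.h>     /* static_assert */
#include <stdalign.h>   /* alignas */
#include <stdatomic.h>
#include <stddef.h>     /* offsetof, size_t */

#define CACHELINE 64    /* assumed line size; real hardware varies */

/* Hypothetical file-like object: hot counter and read-mostly fields
 * sit next to each other, so they land in the same cache line. */
struct demo_file_shared {
    atomic_long refcount;   /* written on every get/put */
    long        flags;      /* read-mostly, but shares the line */
    long        mode;
};

/* Same fields, with the written counter isolated on its own line. */
struct demo_file_split {
    alignas(CACHELINE) atomic_long refcount;  /* hot, written */
    alignas(CACHELINE) long        flags;     /* starts a new line */
    long                           mode;
};

/* Cache line index of a field, relative to the start of the struct
 * (assumes the struct itself starts on a line boundary). */
static inline size_t line_of(size_t offset)
{
    return offset / CACHELINE;
}

/* Stand-ins for a per-call reference get/put pair. */
static inline void demo_get(atomic_long *ref)
{
    atomic_fetch_add_explicit(ref, 1, memory_order_relaxed);
}

/* Returns the count remaining after the reference is dropped. */
static inline long demo_put(atomic_long *ref)
{
    return atomic_fetch_sub_explicit(ref, 1, memory_order_release) - 1;
}

static_assert(offsetof(struct demo_file_split, flags) >= CACHELINE,
              "read-mostly fields must not share the counter's line");
```

In the first layout every demo_get()/demo_put() from another CPU
invalidates the line that also holds flags and mode; in the second,
readers of those fields are not disturbed by counter updates, although
the counter line itself still bounces between the writers.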

Unless I'm mistaken, the big difference between the read fault and
the read(2) cases is that mmap holds a reference on the file, while
open(2) doesn't?

I guess if anyone really cares about that, they could hack up a flag
to tell the file to remain pinned.
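
A minimal sketch of what such a flag might look like, written as
userspace C rather than against the real kernel structures. The
demo_handle type, the pinned field, and both helpers are hypothetical
names invented for this example; the assumption is that whoever sets
pinned guarantees the object outlives every call, so the per-call
reference (and the write to the shared counter) can be skipped.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical handle; not the kernel's struct file. */
struct demo_handle {
    atomic_long refcount;  /* shared counter, contended across CPUs */
    bool        pinned;    /* assumed flag: lifetime guaranteed by owner */
};

/* Enter a read-like call. Takes a per-call reference unless the
 * handle is pinned. Returns true if a reference was taken. */
static bool demo_enter(struct demo_handle *h)
{
    if (h->pinned)
        return false;      /* read-only access, no counter write */
    atomic_fetch_add_explicit(&h->refcount, 1, memory_order_relaxed);
    return true;
}

/* Leave the call, dropping the reference only if one was taken. */
static void demo_exit(struct demo_handle *h, bool took_ref)
{
    if (took_ref)
        atomic_fetch_sub_explicit(&h->refcount, 1, memory_order_release);
}
```

With pinned set, the fast path only reads the handle, so the counter's
cache line is no longer written by every caller. The hard part, which
this sketch ignores, is clearing the flag safely while calls are still
in flight.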

--
SUSE Labs, Novell Inc.