On Thu, Nov 27, 2008 at 11:03:40AM -0800, Mike Waychison wrote:
> Nick Piggin wrote:
> > On Thu, Nov 27, 2008 at 01:28:41AM -0800, Mike Waychison wrote:
> > > Török however identified mmap taking on the order of several
> > > milliseconds due to this exact problem:
> > > http://lkml.org/lkml/2008/9/12/185
> > Turns out to be a different problem.
> What do you mean?

His is just contending on the write side. The retry patch doesn't help.

> > > We generally try to avoid such things, but sometimes it a) can't be
> > > easily avoided (third party libraries for instance) and b) when it
> > > hits us, it affects the overall health of the machine/cluster (the
> > > monitoring daemons get blocked, which isn't very healthy).
> > Are you doing appropriate posix_fadvise to prefetch in the files before
> > faulting, and madvise hints if appropriate?
> Yes, we've been slowly rolling out fadvise hints, though not to
> prefetch, and definitely not for faulting. I don't see how issuing a
> prefetch right before we try to fault in a page is going to help
> matters. The pages may appear in pagecache, but they won't be uptodate
> by the time we look at them anyway, so we're back to square one.

The whole point of a prefetch is to issue it sufficiently early so
it makes a difference. Actually if you can tell quite well where the
major faults will be, but don't know it sufficiently in advance to
do very good prefetching, then perhaps we could add a new madvise hint
to synchronously bring the page in (dropping the mmap_sem over the IO).
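
To make the "issue it sufficiently early" point concrete, here is a rough
userspace sketch of the existing hints being discussed. It is only an
illustration under stated assumptions: the helper names start_prefetch() and
map_prefetched() are made up for this example, the file is mapped read-only in
full, and both hints are purely advisory, so the kernel may ignore them. It
does not implement the synchronous madvise hint proposed above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a file and ask for readahead of the whole
 * thing. Call this as early as possible, well before the first fault.
 * Returns the fd, or -1 if the file cannot be opened.
 */
int start_prefetch(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* offset 0, len 0 means "to end of file". Advisory, so the
     * return value (an error number, not -1/errno) is ignored here. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
    return fd;
}

/*
 * Hypothetical helper: map the file later, once the readahead has had
 * time to complete. Returns the mapping, or NULL on failure or for an
 * empty file. The mapping length is stored in *out_len.
 */
void *map_prefetched(int fd, size_t *out_len)
{
    assert(out_len != NULL);

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0)
        return NULL;

    size_t len = (size_t)st.st_size;
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Second advisory hint, this time on the mapping itself. */
    (void)madvise(p, len, MADV_WILLNEED);

    *out_len = len;
    return p;
}
```

The useful work has to happen between the two calls: if map_prefetched() and
the first access follow start_prefetch() immediately, the pages may be in
pagecache but not yet uptodate, which is the "back to square one" case Mike
describes.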