Terrible disk performance when files cached > 4GB
From: Colum Paget
Date: Fri Apr 15 2016 - 05:27:30 EST
Hi all,
I suspect that many people will have reported this, but I thought I'd drop you
a line just in case everyone figures someone else has reported it. It's
possible we're just doing something wrong and so encountering this problem,
but I can't find anyone saying they've found a solution, and the problem
doesn't seem to be present in 3.x kernels, which makes us think it could be a
bug.
We are seeing a problem in 4.4.5 and 4.4.6 32-bit 'hugemem' kernels running on
machines with more than 4GB of RAM. The problem results in disk performance
dropping from 120 MB/s to 1 MB/s or even less. 3.18.x 32-bit kernels do not seem to
exhibit this behaviour, or at least we can't make it happen reliably. We've
tried 3.14.65 and it doesn't exhibit the same degree of problem.
We've not yet been able to test 64-bit kernels, and it will be a while before
we can. We've been able to reproduce the problem on multiple machines with
different hardware configs, and with different kernel configs as regards
SMP, NUMA support and transparent hugepages.
This problem can be reproduced thusly:
Unpack/transfer a *large* number of files onto disk. As they unpack one can
monitor the amount of memory being used for file caching with 'free'. Disk
transfer speeds can be tested by 'dd'-ing a large file locally. Initially the
transfer rate for this file will be over 100 MB/s. However, when the amount of
cached memory exceeds some figure (this was 4GB on some systems, 10GB on
others) disk performance will start to dramatically degrade. Very swiftly the
disks become unusable.
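
For anyone who wants to watch the two figures side by side, the procedure
above could be scripted roughly as follows. This is only a sketch, not the
exact commands we ran: the function names cached_kib and copy_rate_mb_s are
made up for illustration, the first reads the 'Cached:' line from
/proc/meminfo (the same figure 'free' reports as cache), and the second is a
crude stand-in for timing a 'dd' of a large file.

```python
import os
import time


def cached_kib(meminfo_text):
    """Return the 'Cached:' value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        # 'SwapCached:' does not match because the line must start with 'Cached:'.
        if line.startswith("Cached:"):
            return int(line.split()[1])
    raise ValueError("no 'Cached:' line found")


def copy_rate_mb_s(src, dst, chunk=1 << 20):
    """Copy src to dst in chunks, fsync the result, and return the rate in MB/s."""
    start = time.monotonic()
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk)
            if not buf:
                break
            fout.write(buf)
            copied += len(buf)
        fout.flush()
        os.fsync(fout.fileno())  # make sure the data really reached the disk
    elapsed = max(time.monotonic() - start, 1e-9)  # guard against a zero interval
    return copied / elapsed / 1e6


def sample(src, dst):
    """Take one reading: (page cache size in GiB, copy rate in MB/s)."""
    with open("/proc/meminfo") as f:
        cached = cached_kib(f.read())
    return cached / (1024 * 1024), copy_rate_mb_s(src, dst)
```

Calling sample() every few seconds with a large test file, while the unpack
runs in another terminal, should show the copy rate falling away once the
cache figure passes the threshold for that machine.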
On some machines this situation can be recovered by:
echo 3 > /proc/sys/vm/drop_caches
However, we've seen some cases where even this doesn't seem to help, and the
machine has to be rebooted.
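
If the recovery step needs to be triggered from a test script rather than by
hand, something like the sketch below might do. The function name drop_caches
and its path parameter are illustrative only (the parameter exists so the
function can be tried against an ordinary file). It must run as root, and it
syncs first because drop_caches only discards clean pages.

```python
import os


def drop_caches(level=3, path="/proc/sys/vm/drop_caches"):
    """Ask the kernel to drop clean caches.

    level 1 = page cache, 2 = dentries and inodes, 3 = both.
    """
    if level not in (1, 2, 3):
        raise ValueError("level must be 1, 2 or 3")
    os.sync()  # flush dirty pages so they become droppable
    with open(path, "w") as f:
        f.write(f"{level}\n")
```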
We believe the problem is that the memory cache gets so big that searching
through it becomes slower than reading files directly off disk. One problem
with this theory is that we're always copying the same file over and over in
our tests, so the file is unlikely to be a 'cache miss'. Personally I would
have expected performance to only be bad for cache misses, but it's bad for
everything, so maybe our theory is wrong.
For our purposes, we're fine running with 3.14.x series kernels, but I thought
I should let you know.
regards,
Colum
--
Colum Paget
Axiom Software Engineer
Phone: 01827 61212