I initially observed this between kernels 3.2 and 3.5: on 3.2, copying a
180M shared object within the same ext4 filesystem takes 0.6s. On 3.5, it
takes between two and three minutes. A similar throughput regression seems
to happen on any machine running an i386 PAE kernel with a large amount of
memory; the threshold seems to be 16G, and passing mem=15G on the kernel
command line works around it.
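For anyone who wants to compare kernels, here is a rough sketch of a way to time the copy, assuming Python 3 is available. The helper names and the random-data stand-in file are illustrative only; this is not the exact test behind the numbers above. It does a plain buffered copy with no fsync, so it should be run from a directory on the filesystem under test.

```python
import os
import shutil
import sys
import tempfile
import time


def make_test_file(path, size_bytes, chunk=1 << 20):
    """Write size_bytes of random data to path (stand-in for the shared object)."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(os.urandom(n))
            remaining -= n


def timed_copy(src, dst):
    """Copy src to dst with a plain buffered copy (no fsync); return elapsed seconds."""
    start = time.monotonic()
    shutil.copyfile(src, dst)
    return time.monotonic() - start


if __name__ == "__main__":
    # Size in MiB; the default of 180 approximates the file from the report.
    size_mib = int(sys.argv[1]) if len(sys.argv) > 1 else 180
    # The temporary directory is created in the current directory, so run
    # this from somewhere on the ext4 filesystem being tested.
    with tempfile.TemporaryDirectory(dir=".") as d:
        src = os.path.join(d, "src.bin")
        dst = os.path.join(d, "dst.bin")
        make_test_file(src, size_mib << 20)
        print("copy took %.2fs" % timed_copy(src, dst))
```

Running it once on each kernel (or with and without mem=15G) should be enough to show whether the slowdown is present.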

I bisected it to the following change:

  commit ab8fabd46f811d5153d8a0cd2fac9a0d41fb593d
  Author: Johannes Weiner <jweiner@xxxxxxxxxx>
  Date:   Tue Jan 10 15:07:42 2012 -0800

      mm: exclude reserved pages from dirtyable memory

I realize running 32-bit x86 kernels with large amounts of memory is not
advised for various reasons, but I would not expect such a big regression
in basic functionality to be one of them. Is that accurate, or are these
configurations expected to become unusable from 3.3 onwards?
Also CCing Sonny since it looks like he tried to fix an overflow issue
related to the same change with commit c8b74c2f66049, but I'm still
experiencing the problem with a kernel built from master.
Thanks,
- Pierre-Loup