On Wed, Oct 23, 2013 at 05:01:32PM -0400, kosaki.motohiro@xxxxxxxxx wrote:
From: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
Yasuaki Ithimatsu reported that memory hot-add spent more than 5 _hours_
on a 9TB memory machine, and we found out that setup_zone_migrate_reserve
spent >90% of the time.
The problem is that setup_zone_migrate_reserve scans all pageblocks
unconditionally, but this is only necessary when the number of reserved
blocks was reduced (i.e. memory hot remove).
Moreover, the maximum number of MIGRATE_RESERVE blocks per zone is
currently 2. This means the number of reserved pageblocks is almost
always unchanged.
This patch adds zone->nr_migrate_reserve_block to maintain the number
of MIGRATE_RESERVE pageblocks, which reduces the overhead of
setup_zone_migrate_reserve dramatically.
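
For readers unfamiliar with the idea, here is a rough userspace sketch of
caching the reserve count so that the full scan can be skipped when the
target is unchanged. It is not kernel code and not the patch under
discussion: toy_zone, toy_setup_reserve, TOY_MAX_RESERVE and the
blocks_scanned counter are all hypothetical names made up for
illustration, and the "want" argument stands in for a value the kernel
would derive itself.

```c
#include <stddef.h>

/* Toy stand-ins for kernel concepts; none of this is real kernel code. */
enum toy_migratetype { TOY_MOVABLE, TOY_RESERVE };

/* Assumption: mirrors the "at most 2 per zone" cap mentioned in the thread. */
#define TOY_MAX_RESERVE 2

struct toy_zone {
    enum toy_migratetype *blocks; /* one entry per pageblock */
    size_t nr_blocks;
    int nr_reserve_cached;        /* plays the role of nr_migrate_reserve_block */
    size_t blocks_scanned;        /* instrumentation only, to show the saving */
};

/* Bring the number of reserve blocks in the zone to min(want, cap). */
void toy_setup_reserve(struct toy_zone *z, int want)
{
    int kept = 0;
    size_t i;

    if (want > TOY_MAX_RESERVE)
        want = TOY_MAX_RESERVE;

    /* Fast path: the cached count already matches, so skip the scan. */
    if (z->nr_reserve_cached == want)
        return;

    /* Slow path: walk every pageblock, as the unconditional scan did. */
    for (i = 0; i < z->nr_blocks; i++) {
        z->blocks_scanned++;
        if (z->blocks[i] == TOY_RESERVE) {
            if (kept < want)
                kept++;                      /* keep an existing reserve */
            else
                z->blocks[i] = TOY_MOVABLE;  /* surplus reserve, release it */
        } else if (kept < want) {
            z->blocks[i] = TOY_RESERVE;      /* take a new reserve block */
            kept++;
        }
    }
    z->nr_reserve_cached = kept;
}
```

In this toy the cost of a repeated call with an unchanged target is a
single comparison instead of a walk over every pageblock, which is the
effect the changelog describes. The conditions under which the real
function may skip the scan are more subtle than this sketch suggests.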

It seems regrettable to expand the size of struct zone just for this.
You are right that the number of blocks does not exceed 2 because of a
check made in setup_zone_migrate_reserve, so it should be possible to
special case this. I didn't test this or think about it particularly
carefully, and no doubt there is a nicer way, but for illustration
purposes see the patch below.