[PATCH 0/3] Use per-cpu allocator for !irq requests and prepare for a bulk allocator

From: Mel Gorman
Date: Thu Jan 12 2017 - 05:43:41 EST


Changelog since v2
o Add acks and benchmark data
o Rebase to 4.10-rc3

Changelog since v1
o Remove a scheduler point from the allocation path
o Finalise the bulk allocator and test it

This series is motivated by a conversation led by Jesper Dangaard Brouer at
the last LSF/MM proposing a generic page pool for DMA-coherent pages. Part
of his motivation was the overhead of allocating multiple order-0 pages,
which led some drivers to use high-order allocations and split them. This
is very slow in some cases.

The first two patches in this series restructure the page allocator such
that it is relatively easy to introduce an order-0 bulk page allocator.
A patch exists to do that and has been handed over to Jesper until an
in-kernel user is created. The third patch alters the per-cpu allocator
to make it exclusive to !irq requests. This cuts allocation/free overhead
by roughly 30%.
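
As a rough illustration of the idea only, here is a userspace toy model,
not kernel code. It assumes the per-cpu list is refilled from the global
free lists in batches and that requests made in interrupt context bypass
the per-cpu list and take the global lock directly. Every name in it
(global_pool, pcp_cache, pool_alloc_bulk, model_alloc_page, PCP_BATCH) is
hypothetical and does not appear in the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_PAGES 64   /* size of the toy global pool (assumption) */
#define PCP_BATCH   8   /* pages moved per refill (assumption) */

/* Toy stand-in for the global free lists; pages are just integer ids. */
struct global_pool {
	int next_free;   /* id of the next page that has not been handed out */
	int lock_taken;  /* how often the "expensive" global lock was taken */
};

/* Toy stand-in for one CPU's private list of order-0 pages. */
struct pcp_cache {
	int pages[PCP_BATCH];
	int count;
};

/* Hand out up to n pages under a single (modelled) lock acquisition. */
static int pool_alloc_bulk(struct global_pool *pool, int *out, int n)
{
	int got = 0;

	pool->lock_taken++;
	while (got < n && pool->next_free < POOL_PAGES)
		out[got++] = pool->next_free++;
	return got;
}

/* Allocate one page id, or return -1 if the pool is exhausted. */
static int model_alloc_page(struct global_pool *pool, struct pcp_cache *pcp,
			    bool in_irq)
{
	int page;

	/* IRQ-context requests never touch the per-cpu list in this model. */
	if (in_irq)
		return pool_alloc_bulk(pool, &page, 1) == 1 ? page : -1;

	/* !irq requests use the per-cpu list and refill it in one batch. */
	if (pcp->count == 0)
		pcp->count = pool_alloc_bulk(pool, pcp->pages, PCP_BATCH);
	if (pcp->count == 0)
		return -1;
	return pcp->pages[--pcp->count];
}
```

In this model, PCP_BATCH consecutive !irq allocations cost a single global
lock acquisition, while each IRQ-context allocation costs one of its own.
The model says nothing about the real locking or about the measured 30%.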

Performance tests from both Jesper and me are included in the patch.

include/linux/gfp.h | 24 ++++
mm/page_alloc.c | 353 +++++++++++++++++++++++++++++++++++++---------------
2 files changed, 278 insertions(+), 99 deletions(-)

--
2.11.0