[RFC 0/4] Transparent on-demand struct page initialization embedded in the buddy allocator
From: Robin Holt
Date: Thu Jul 11 2013 - 22:05:11 EST
We have been working on this since we returned from shutdown and have
something to discuss now. We restricted ourselves to 2MiB initialization
to keep the patch set a little smaller and clearer.
First, I think I want to propose getting rid of the page flag. If I knew
of a concrete way to determine that a page has not been initialized,
this patch series would look different. If there is no definitive
way to determine that a struct page has been initialized aside from
checking that the entire struct page is zero, then I would suggest
we change the page flag to indicate that the page has been initialized.
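To make the two options concrete, here is a rough userspace sketch. It
is not taken from the patches: struct fake_page, FAKE_PG_INITIALIZED and
the helper names are all made up for illustration, and it assumes the
memory holding the page structs was zeroed before first use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct page; the real layout differs. */
struct fake_page {
	unsigned long flags;
	int count;
	void *lru_next;
	void *lru_prev;
};

/* Hypothetical flag bit meaning "this struct has been initialized". */
#define FAKE_PG_INITIALIZED (1UL << 0)

/*
 * Option A: treat an all-zero struct as "never initialized".
 * Every byte of the struct has to be compared.
 */
static bool fake_page_is_zero(const struct fake_page *p)
{
	static const struct fake_page zero; /* static storage is zero-filled */

	return memcmp(p, &zero, sizeof(*p)) == 0;
}

/*
 * Option B: a positive "initialized" flag. Zeroed memory reads as
 * "not initialized" and the test touches a single word.
 */
static bool fake_page_initialized(const struct fake_page *p)
{
	return (p->flags & FAKE_PG_INITIALIZED) != 0;
}

/* Fill in the struct and mark it initialized. */
static void fake_page_init(struct fake_page *p)
{
	assert(p != NULL);
	p->count = 1;
	p->lru_next = p;
	p->lru_prev = p;
	p->flags |= FAKE_PG_INITIALIZED;
}
```

With a positive flag, zero-filled memory needs no extra marking pass to
read as uninitialized, which is the appeal of inverting the sense of the
flag.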
The heart of the problem as I see it comes from expand(). We nearly
always see a first reference to a struct page which is in the middle
of the 2MiB region. Due to that access, the unlikely() check that was
originally proposed really ends up referencing a different page entirely.
We have not actually introduced an unlikely(), nor refactored the
patches to put that unlikely() inside a static inline function. Also,
given the strong warning at the head of expand(), we did not feel
experienced enough to refactor it to make things always reference the
2MiB page first.
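As a rough illustration of why the check lands on a different page, here
is a userspace sketch, not the code in the patches. It assumes 4KiB base
pages and that the "uninitialized" marker lives only on the first struct
page of each 2MiB chunk. struct fake_page, FAKE_PG_UNINIT and every
helper name are hypothetical, and unlikely() is defined locally.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT      12 /* 4KiB base pages (assumed) */
#define CHUNK_SHIFT     21 /* 2MiB initialization unit */
#define PAGES_PER_CHUNK (1UL << (CHUNK_SHIFT - PAGE_SHIFT)) /* 512 */

#if defined(__GNUC__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (x)
#endif

/* Hypothetical stand-in for struct page. */
struct fake_page {
	unsigned long flags;
	int count;
};

/* Hypothetical marker, set only on the first page of a 2MiB chunk. */
#define FAKE_PG_UNINIT (1UL << 0)

/* Round a pfn down to the first pfn of its 2MiB chunk. */
static inline unsigned long chunk_head_pfn(unsigned long pfn)
{
	return pfn & ~(PAGES_PER_CHUNK - 1);
}

/* Initialize every struct in the chunk, then clear the head marker. */
static void init_chunk(struct fake_page *memmap, unsigned long head)
{
	unsigned long i;

	for (i = 0; i < PAGES_PER_CHUNK; i++)
		memmap[head + i].count = 1;
	memmap[head].flags &= ~FAKE_PG_UNINIT;
}

/*
 * The first touch is usually a pfn in the middle of a chunk, so the
 * check must look at the chunk head, not at the page being touched.
 */
static inline void ensure_page_initialized(struct fake_page *memmap,
					   unsigned long pfn)
{
	unsigned long head = chunk_head_pfn(pfn);

	assert(memmap != NULL);
	if (unlikely(memmap[head].flags & FAKE_PG_UNINIT))
		init_chunk(memmap, head);
}
```

A caller that first touches pfn 700 would end up testing the struct for
pfn 512, which is the different-page reference described above.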
With this patch set, we booted a 16TiB machine. Without the patches,
the v3.10 kernel with the same configuration took 407 seconds for
free_all_bootmem(). With the patches and operating on 2MiB pages instead
of 1GiB, it took 26 seconds, roughly 15 times faster. I have no feel
for how the 1GiB chunk size will perform.
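For a sense of scale, the arithmetic behind those sizes can be sketched
as below. The 4KiB base page size is an assumption on my part, and
units_in() is just an illustrative helper, not anything in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define KIB (1ULL << 10)
#define MIB (1ULL << 20)
#define GIB (1ULL << 30)
#define TIB (1ULL << 40)

/* Number of unit_bytes-sized units that fit in mem_bytes of memory. */
static inline uint64_t units_in(uint64_t mem_bytes, uint64_t unit_bytes)
{
	assert(unit_bytes != 0);
	return mem_bytes / unit_bytes;
}
```

Under that assumption, units_in(16 * TIB, 4 * KIB) is 2^32 page structs,
units_in(16 * TIB, 2 * MIB) is 2^23 chunks at the 2MiB size, and
units_in(16 * TIB, GIB) is 2^14 chunks at the 1GiB size.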
I am on vacation for the next three days, so I am sorry in advance for
my infrequent or nonexistent responses.
Signed-off-by: Robin Holt <holt@xxxxxxx>
Signed-off-by: Nate Zimmer <nzimmer@xxxxxxx>
To: "H. Peter Anvin" <hpa@xxxxxxxxx>
To: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Linux Kernel <linux-kernel@xxxxxxxxxxxxxxx>
Cc: Linux MM <linux-mm@xxxxxxxxx>
Cc: Rob Landley <rob@xxxxxxxxxxx>
Cc: Mike Travis <travis@xxxxxxx>
Cc: Daniel J Blueman <daniel@xxxxxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Yinghai Lu <yinghai@xxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>