Re: [PATCH 0/3] makedumpfile: hugepage filtering for vmcore dump

From: HATAYAMA Daisuke
Date: Wed Nov 06 2013 - 19:55:56 EST


(2013/11/06 11:21), Atsushi Kumagai wrote:
(2013/11/06 5:27), Vivek Goyal wrote:
On Tue, Nov 05, 2013 at 09:45:32PM +0800, Jingbai Ma wrote:
This patch set intends to exclude unnecessary hugepages from the vmcore dump file.

This patch set requires the kernel patch that exports the necessary data structures into
vmcore: "kexec: export hugepage data structure into vmcoreinfo"
http://lists.infradead.org/pipermail/kexec/2013-November/009997.html

This patch set introduces two new dump levels, 32 and 64, to exclude all unused and
active hugepages. The level that excludes all unnecessary pages will now be 127.
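
For reference, dump levels in makedumpfile are a bitmask of page classes to exclude; below is a sketch of how the proposed bits would compose with the existing ones. The DL_EXCLUDE_HUGE_* names are placeholders for illustration, not names taken from the patch.

/* Existing dump_level bits in makedumpfile.h. */
#define DL_EXCLUDE_ZERO       (0x001)   /* Exclude pages filled with zero      */
#define DL_EXCLUDE_CACHE      (0x002)   /* Exclude cache pages without private */
#define DL_EXCLUDE_CACHE_PRI  (0x004)   /* Exclude cache pages with private    */
#define DL_EXCLUDE_USER_DATA  (0x008)   /* Exclude user process data pages     */
#define DL_EXCLUDE_FREE       (0x010)   /* Exclude free pages                  */

/* Placeholder names for the two proposed levels. */
#define DL_EXCLUDE_HUGE_FREE  (0x020)   /* level 32: unused hugepages          */
#define DL_EXCLUDE_HUGE_USED  (0x040)   /* level 64: active hugepages          */

/* 1 + 2 + 4 + 8 + 16 + 32 + 64 = 127 excludes every class above. */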

Interesting. Why should hugepages be treated any differently than normal
pages?

If the user asked to filter out free pages, then they should be filtered, and
it should not matter whether a page is a huge page or not.

I'm making an RFC patch for hugepage filtering based on that policy.

I attach the prototype version.
It can also filter out THPs, and it is suitable for cyclic processing
because it depends only on mem_map, so the lookup can be divided into
cycles. This is the same idea as page_is_buddy().

So I think it's better.
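
For readers following the diff below: page_is_hugepage() tests the compound-page flag of a mem_map entry. A minimal sketch of what such a helper can look like, assuming the PG_head/PG_tail/PG_compound flag numbers are exported via vmcoreinfo and reachable through makedumpfile's NUMBER() table:

static int
page_is_hugepage(unsigned long flags)
{
	/* Sketch: treat a head or tail page of a compound page as a hugepage. */
	if (NUMBER(PG_head) != NOT_FOUND_NUMBER)
		return !!(flags & (1UL << NUMBER(PG_head)));
	if (NUMBER(PG_tail) != NOT_FOUND_NUMBER)
		return !!(flags & (1UL << NUMBER(PG_tail)));
	if (NUMBER(PG_compound) != NOT_FOUND_NUMBER)
		return !!(flags & (1UL << NUMBER(PG_compound)));
	return 0;
}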


@@ -4506,14 +4583,49 @@ __exclude_unnecessary_pages(unsigned long mem_map,
&& !isAnon(mapping)) {
if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
pfn_cache_private++;
+ /*
+ * NOTE: If THP for cache is introduced, the check for
+ * compound pages is needed here.
+ */
}
/*
* Exclude the data page of the user process.
*/
- else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
- && isAnon(mapping)) {
- if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
- pfn_user++;
+ else if (info->dump_level & DL_EXCLUDE_USER_DATA) {
+ /*
+ * Exclude the anonymous pages as user pages.
+ */
+ if (isAnon(mapping)) {
+ if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
+ pfn_user++;
+
+ /*
+ * Check the compound page
+ */
+ if (page_is_hugepage(flags) && compound_order > 0) {
+ int i, nr_pages = 1 << compound_order;
+
+ for (i = 1; i < nr_pages; ++i) {
+ if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
+ pfn_user++;
+ }
+ pfn += nr_pages - 1;
+ mem_map += (nr_pages - 1) * SIZE(page);
+ }
+ }
+ /*
+ * Exclude the hugetlbfs pages as user pages.
+ */
+ else if (hugetlb_dtor == SYMBOL(free_huge_page)) {
+ int i, nr_pages = 1 << compound_order;
+
+ for (i = 0; i < nr_pages; ++i) {
+ if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
+ pfn_user++;
+ }
+ pfn += nr_pages - 1;
+ mem_map += (nr_pages - 1) * SIZE(page);
+ }
}
/*
* Exclude the hwpoison page.
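
The pfn/mem_map adjustment after clearing a compound page is what keeps the caller's per-pfn scan from revisiting the tail pages. A standalone sketch of that arithmetic (simplified; it assumes the usual for (; pfn < pfn_end; pfn++, mem_map += SIZE(page)) loop in __exclude_unnecessary_pages()):

/*
 * Sketch: clear every page of a compound page whose head is at 'pfn',
 * then advance the loop variables so that the loop's own pfn++ and
 * mem_map += SIZE(page) land on the first page after the compound page.
 */
unsigned long nr_pages = 1UL << compound_order;
unsigned long i;

for (i = 0; i < nr_pages; i++)
	if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
		pfn_user++;

pfn     += nr_pages - 1;                 /* the loop's pfn++ adds the final step */
mem_map += (nr_pages - 1) * SIZE(page);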

I'm concerned about the case where filtering is not applied to the part of the mem_map
entries that falls outside the current cyclic range (for example, when a compound page
straddles a cycle boundary).

If the maximum value of compound_order is larger than the maximum buddy order,
CONFIG_FORCE_MAX_ZONEORDER, which makedumpfile obtains from ARRAY_LENGTH(zone.free_area),
it's necessary to align info->bufsize_cyclic with the larger of the two in
check_cyclic_buffer_overrun().
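
A sketch of that adjustment, assuming check_cyclic_buffer_overrun() keeps its current shape; max_compound_order is a hypothetical value that would have to be derived from the exported hugepage information, and the helper macros follow makedumpfile's existing conventions:

/*
 * Sketch: info->bufsize_cyclic (bytes of the partial bitmap, one bit per
 * page) should be a multiple of the largest block of pages that a single
 * mem_map/free_area entry can force us to clear at once, so that no such
 * block straddles a cycle boundary.
 */
int max_compound_order = 0;  /* hypothetical: largest compound_order in the vmcore */

unsigned long buddy_max_nr_pages    = 1UL << (ARRAY_LENGTH(zone.free_area) - 1);
unsigned long compound_max_nr_pages = 1UL << max_compound_order;
unsigned long max_block_nr_pages    = (compound_max_nr_pages > buddy_max_nr_pages)
				      ? compound_max_nr_pages : buddy_max_nr_pages;
unsigned long block_size = divideup(max_block_nr_pages, BITPERBYTE);

if (info->bufsize_cyclic % block_size)
	info->bufsize_cyclic = round(info->bufsize_cyclic, block_size);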

--
Thanks.
HATAYAMA, Daisuke
