Re: [PATCH v4 2/2] mm/page_ref: add tracepoint to track down page reference manipulation

From: Vlastimil Babka
Date: Wed Mar 02 2016 - 11:58:35 EST


On 02/26/2016 01:58 AM, js1304@xxxxxxxxx wrote:
From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>

CMA allocation should be guaranteed to succeed by definition, but,
unfortunately, it sometimes fails. It is hard to track down
the problem, because it is related to page reference manipulation and
we don't have any facility to analyze it.

This patch adds tracepoints to track down page reference manipulation.
With them, we can find the exact reason for a failure and fix the problem.
The following is an example of tracepoint output. (Note: this example is
from a stale version that prints flags as a number. The recent version
prints them as a human-readable string.)

Enabling this feature bloats kernel text by about 30 KB in my configuration.

    text     data      bss       dec     hex  filename
12127327  2243616  1507328  15878271  f2487f  vmlinux_disabled
12157208  2258880  1507328  15923416  f2f8d8  vmlinux_enabled


That's not bad, and it's even configurable. Thanks for taking extra care over the overhead since v1.
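
As a quick check of the 30 KB figure, here is a minimal userspace sketch (not part of the patch) that recomputes the per-section growth from the size output quoted above. The struct and function names are made up for illustration; only the numbers come from the table.

```c
#include <stdio.h>

/* One row of `size` output, in bytes. Names are illustrative only. */
struct size_row {
	unsigned long text;
	unsigned long data;
	unsigned long bss;
};

/* Figures copied from the quoted table. */
static const struct size_row vmlinux_disabled = { 12127327UL, 2243616UL, 1507328UL };
static const struct size_row vmlinux_enabled  = { 12157208UL, 2258880UL, 1507328UL };

/* Sum of the three sections, i.e. the "dec" column of `size`. */
unsigned long size_total(const struct size_row *r)
{
	return r->text + r->data + r->bss;
}

/* Print how much each section grows when the option is enabled. */
void print_size_delta(void)
{
	printf("text:  +%lu bytes\n", vmlinux_enabled.text - vmlinux_disabled.text);
	printf("data:  +%lu bytes\n", vmlinux_enabled.data - vmlinux_disabled.data);
	printf("bss:   +%lu bytes\n", vmlinux_enabled.bss - vmlinux_disabled.bss);
	printf("total: +%lu bytes\n",
	       size_total(&vmlinux_enabled) - size_total(&vmlinux_disabled));
}
```

Calling print_size_delta() from a main() shows text growing by 29881 bytes, which is where the "30 KB" comes from. Data grows by a further 15264 bytes and bss is unchanged.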

Note that, due to a header file dependency problem between mm.h and
tracepoint.h, this feature has to open-code the static key functions
for tracepoints. This was proposed by Steven Rostedt at the following link.

https://lkml.org/lkml/2015/12/9/699
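
To illustrate the idea, here is a rough userspace sketch of the open-coded pattern: the inline helper tests an "enabled" key itself and only calls an out-of-line function when tracing is on, so the header holding the inline helper never needs the tracepoint machinery. This is not the patch's code. The plain bool stands in for the kernel's static key (jump label), and all names below are hypothetical stand-ins.

```c
#include <stdbool.h>

/* Stand-in for struct page; only the reference count matters here. */
struct page {
	int refcount;
};

/*
 * Stand-in for the tracepoint's static key. In the kernel this would be
 * a jump label patched at runtime; a plain bool is an assumption made so
 * that the sketch builds in userspace.
 */
static bool page_ref_mod_key;

/* Counts calls into the out-of-line hook so the effect is observable. */
static unsigned int trace_calls;

/* Switch the simulated tracepoint on or off. */
void page_ref_trace_enable(bool on)
{
	page_ref_mod_key = on;
}

/* Number of times the out-of-line hook has run. */
unsigned int page_ref_trace_calls(void)
{
	return trace_calls;
}

/*
 * Out-of-line slow path. In a real kernel this would live in a .c file
 * that is free to include the tracepoint headers and emit the event.
 */
void page_ref_mod_slowpath(struct page *page, int v)
{
	(void)page;
	(void)v;
	trace_calls++;
}

/* Open-coded "is the tracepoint enabled?" check, safe to put in a header. */
static inline bool page_ref_tracepoint_active(void)
{
	return page_ref_mod_key;
}

/* Fast path: modify the count, and call out only if tracing is enabled. */
static inline void page_ref_inc(struct page *page)
{
	page->refcount++;
	if (page_ref_tracepoint_active())
		page_ref_mod_slowpath(page, 1);
}
```

With the key off, page_ref_inc() only pays for the check. With a real static key that check becomes a patched no-op branch rather than a memory load, which is what keeps the disabled-case overhead negligible.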

v3:
o Add commit description and code comment explaining why this patch
  open-codes the static key functions for tracepoints.
o Note that the example is from a stale version.
o Add "depends on TRACEPOINTS".

v2:
o Use the static key of each tracepoint to avoid function call overhead
  when tracepoints are disabled.
o Print human-readable page flags thanks to the newly introduced %pgp option.
o Add more description to Kconfig.debug.

Acked-by: Michal Nazarewicz <mina86@xxxxxxxxxx>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

+config DEBUG_PAGE_REF
+ bool "Enable tracepoint to track down page reference manipulation"
+ depends on DEBUG_KERNEL
+ depends on TRACEPOINTS
+ ---help---
+ This feature adds tracepoints for tracking down page reference
+ manipulation. This tracking is useful for diagnosing functional failures
+ due to migration failures caused by page reference mismatches. Be

OK.

+ careful to turn on this feature because it could bloat some kernel
+ text. In my configuration, it bloats 30 KB. Although kernel text will
+ be bloated, there would be no runtime performance overhead if
+ tracepoint isn't enabled thanks to jump label.

I would just write something like:

Enabling this feature adds about 30 KB to the kernel code, but runtime performance overhead is virtually none until the tracepoints are actually enabled.