[RequestForTesters] SystemTap-based memory allocation failure injection

From: Tetsuo Handa
Date: Thu Dec 25 2014 - 08:10:04 EST


Since it has been an unwritten rule that low-order
(<=PAGE_ALLOC_COSTLY_ORDER) GFP_KERNEL allocations never fail unless
the allocating task is chosen by the OOM killer, there is a lot of code
whose allocation failure error paths are hardly tested.

This is an update of the memory allocation failure injection tester
which I used at http://marc.info/?l=linux-fsdevel&m=137104297905865&w=2 .

Since SystemTap can generate backtraces without garbage lines,
we can uniquely identify each backtrace and inject a failure only once
per backtrace, making it possible to test every memory allocation caller.
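
The once-per-backtrace idea can be sketched outside the kernel as follows.
This is only an illustration in plain shell, not part of the tester; the
function name first_time_only and the caller_* names are hypothetical.
It uses the same "counter is 0 on first sight" test as the
traces_bt[bt]++ == 0 check in the scripts.

```shell
#!/bin/sh
# Print each distinct input line only the first time it appears.
# seen[$0]++ evaluates to 0 on first sight, so !seen[$0]++ is true exactly
# once per distinct line, mirroring "traces_bt[bt]++ == 0".
first_time_only() {
    awk '!seen[$0]++'
}

# Hypothetical caller names standing in for backtraces, one per line.
printf '%s\n' caller_a caller_b caller_a caller_c caller_b | first_time_only
```

Each repeated line is passed through only once, just as each backtrace
gets only one injected failure.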

Steps for installation and testing are described below.

---------- installation start ----------
wget http://sourceware.org/systemtap/ftp/releases/systemtap-2.6.tar.gz
echo '65e6745f0ec103758c711dd1d12fb6bf systemtap-2.6.tar.gz' | md5sum --check -
tar -zxf systemtap-2.6.tar.gz
cd systemtap-2.6
./configure --prefix=$HOME/systemtap.tmp
make -s
make -s install
---------- installation end ----------

---------- preparation (optional) start ----------
Start the kdump service and write 1 to /proc/sys/vm/panic_on_oops as the
root user so that we can obtain a vmcore upon a kernel oops.
---------- preparation (optional) end ----------
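
The preparation step might look like the sketch below. The helper names
enable_panic_on_oops and panic_on_oops_enabled are made up for
illustration, and the service name "kdump" and the use of the "service"
command are assumptions which differ between distributions. The file
path can be overridden so that the helpers can be tried on a scratch
file without root.

```shell
#!/bin/sh
# Sketch only. Against the real /proc file this must be run as root.

# Write 1 to the panic_on_oops file (path overridable for dry runs).
enable_panic_on_oops() {
    sysctl_file=${1:-/proc/sys/vm/panic_on_oops}
    echo 1 > "$sysctl_file"
}

# Succeed if the panic_on_oops file currently contains 1.
panic_on_oops_enabled() {
    sysctl_file=${1:-/proc/sys/vm/panic_on_oops}
    [ "$(cat "$sysctl_file")" = "1" ]
}

# Typical use as root (commented out so that sourcing this file is harmless):
#   service kdump start      # assumption: SysV-style service named "kdump"
#   enable_panic_on_oops
#   panic_on_oops_enabled && echo "panic_on_oops is set"
```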

---------- testing start ----------
Run

$HOME/systemtap.tmp/bin/staprun fault_injection.ko

and operate as you like, and see whether your system can survive or not.
---------- testing end ----------

fault_injection.ko is generated by the commands shown below.
The scripts shown below check only sleepable allocations. If you
replace %{ __GFP_WAIT %} (the right-hand side of the comparison) with 0,
you can check atomic allocations instead.
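
To make the flag test concrete, here is a small sketch of the same
predicate in shell arithmetic. The numeric flag values are my assumption
based on 3.18-era include/linux/gfp.h and should be verified against
your kernel; the function names are hypothetical.

```shell
#!/bin/sh
# Assumed flag values (3.18-era include/linux/gfp.h); verify on your kernel.
GFP_WAIT=$((0x10))
GFP_HIGH=$((0x20))
GFP_IO=$((0x40))
GFP_FS=$((0x80))
GFP_NOFAIL=$((0x800))

# Succeeds if the sleepable-allocation scripts would inject for flags $1.
injects_sleepable() {
    [ $(( ($1 & (GFP_NOFAIL | GFP_WAIT)) == GFP_WAIT )) -eq 1 ]
}

# Succeeds if the atomic variant (right-hand side replaced with 0) would.
injects_atomic() {
    [ $(( ($1 & (GFP_NOFAIL | GFP_WAIT)) == 0 )) -eq 1 ]
}

GFP_KERNEL=$((GFP_WAIT | GFP_IO | GFP_FS))
GFP_ATOMIC=$GFP_HIGH

injects_sleepable "$GFP_KERNEL" && echo "GFP_KERNEL: sleepable script injects"
injects_atomic "$GFP_ATOMIC" && echo "GFP_ATOMIC: atomic variant injects"
injects_sleepable "$((GFP_KERNEL | GFP_NOFAIL))" || echo "GFP_KERNEL|__GFP_NOFAIL: skipped"
```

In both variants, allocations carrying __GFP_NOFAIL are never selected,
since the masked value then matches neither __GFP_WAIT alone nor 0.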

---------- For testing __kmalloc() failure ----------
$HOME/systemtap.tmp/bin/stap -p4 -m fault_injection -g -DSTP_NO_OVERLOAD -e '
global traces_bt[65536];
probe begin { printf("Probe start!\n"); }
probe kernel.function("__kmalloc") {
  # Only sleepable allocations without __GFP_NOFAIL; skip stapio itself.
  if (($flags & %{ __GFP_NOFAIL | __GFP_WAIT %} ) == %{ __GFP_WAIT %} && execname() != "stapio") {
    bt = backtrace();
    # Inject a failure only the first time each backtrace is seen.
    if (traces_bt[bt]++ == 0) {
      print_stack(bt);
      printf("\n\n");
      # Request an oversized allocation so that this call fails.
      $size = 1 << 30;
    }
  }
}
probe end { delete traces_bt; }'
---------- For testing __kmalloc() failure ----------

The one below might be too aggressive because it triggers the OOM killer
when the page fault handler fails.

---------- For testing __alloc_pages_nodemask() failure ----------
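$HOME/systemtap.tmp/bin/stap -p4 -m fault_injection -g -DSTP_NO_OVERLOAD -e '
global traces_bt[65536];
probe begin { printf("Probe start!\n"); }
probe kernel.function("__alloc_pages_nodemask") {
  # Only sleepable allocations without __GFP_NOFAIL; skip stapio itself.
  if (($gfp_mask & %{ __GFP_NOFAIL | __GFP_WAIT %} ) == %{ __GFP_WAIT %} && execname() != "stapio") {
    bt = backtrace();
    # Inject a failure only the first time each backtrace is seen.
    if (traces_bt[bt]++ == 0) {
      print_stack(bt);
      printf("\n\n");
      # Request an impossible order so that this call fails, and add
      # __GFP_NORETRY so that the allocator does not keep retrying.
      $order = 1 << 30;
      $gfp_mask = $gfp_mask | %{ __GFP_NORETRY %};
    }
  }
}
probe end { delete traces_bt; }'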
---------- For testing __alloc_pages_nodemask() failure ----------