[PATCH 2.6.18-rc4 02/10] Some documentation for kmemleak

From: Catalin Marinas
Date: Sat Aug 12 2006 - 17:58:10 EST

From: Catalin Marinas <catalin.marinas@xxxxxxx>

Signed-off-by: Catalin Marinas <catalin.marinas@xxxxxxx>

Documentation/kmemleak.txt | 157 ++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 157 insertions(+), 0 deletions(-)

diff --git a/Documentation/kmemleak.txt b/Documentation/kmemleak.txt
new file mode 100644
index 0000000..4d2f608
--- /dev/null
+++ b/Documentation/kmemleak.txt
@@ -0,0 +1,157 @@
+Kernel Memory Leak Detector
+Kmemleak provides a way of detecting possible kernel memory leaks in a
+way similar to a tracing garbage collector
+with the difference that the orphan pointers are not freed but only
+reported via /sys/kernel/debug/memleak. A similar method is used by
+the Valgrind tool (memcheck --leak-check) to detect the memory leaks
+in user-space applications.
+CONFIG_DEBUG_MEMLEAK has to be enabled. For additional config options,
+look in:
+ -> Kernel hacking
+ -> Kernel debugging
+ -> Debug slab memory allocations
+ -> Kernel memory leak detector
+To display the possible memory leaks:
+ # mount -t debugfs nodev /sys/kernel/debug/
+ # cat /sys/kernel/debug/memleak
+In order to reduce the run-time overhead, memory scanning is only
+performed when reading the /sys/kernel/debug/memleak file.
+Basic Algorithm
+The memory allocations via kmalloc, vmalloc, kmem_cache_alloc and
+friends are tracked and the pointers, together with additional
+information like size and stack trace, are stored in a radix tree. The
+corresponding freeing function calls are tracked and the pointers
+removed from the radix tree.
+An allocated block of memory is considered orphan if a pointer to its
+start address or to an alias (pointer aliases are explained later)
+cannot be found by scanning the memory (including saved
+registers). This means that there might be no way for the kernel to
+pass the address of the allocated block to a freeing function and
+therefore the block is considered a leak.
+The scanning algorithm steps:
+ 1. mark all pointers as white (remaining white pointers will later
+ be considered orphan)
+ 2. scan the memory starting with the data section and stacks,
+ checking the values against the addresses stored in the radix
+ tree. If a white pointer is found, it is added to the grey list
+ 3. scan the grey pointers for matching addresses (some white
+ pointers can become grey and added at the end of the grey list)
+ until the grey set is finished
+ 4. the remaining white pointers are considered orphan and reported
+ via /sys/kernel/debug/memleak
+Because the Linux kernel calculates many pointers at run-time via the
+container_of macro (see the lists implementation), a lot of false
+positives would be reported. This tool re-writes the container_of
+macro so that the offset and type information is stored in the
+.init.memleak_offsets section. The memleak_init() function creates a
+radix tree with corresponding offsets for every encountered block
+type. The memory allocations hook stores the pointer address together
+with its aliases based on the type of the allocated block.
+While one level of offsets should be enough for most cases, a second
+level, i.e. container_of(container_of(...)), can be enabled via the
+configuration options (one false positive is the "struct socket_alloc"
+allocation in the sock_alloc_inode() function).
+Some allocated memory blocks have pointers stored in the kernel's
+internal data structures and they cannot be detected as orphans. To
+avoid this, kmemleak can also store the number of values equal to the
+pointer (or aliases) that need to be found so that the block is not
+considered a leak. One example is __vmalloc().
+Limitations and Drawbacks
+The biggest drawback is the reduced performance of memory allocation
+and freeing. To avoid other penalties, the memory scanning is only
+performed when the /sys/kernel/debug/memleak file is read. Anyway,
+this tool is intended for debugging purposes where the performance
+might not be the most important requirement.
+Kmemleak currently approximates the type id using the sizeof()
+compiler built-in function. This is not accurate and can lead to false
+negatives. The aim is to gradually change the kernel and kmemleak to
+do more precise type identification.
+Another source of false negatives is the data stored in non-pointer
+values. Together with the more precise type identification, kmemleak
+could only scan the pointer members in the allocated structures.
+The tool can report false positives. These are cases where an
+allocated block doesn't need to be freed (some cases in the init_call
+functions), the pointer is calculated by other methods than the
+container_of macro or the pointer is stored in a location not scanned
+by kmemleak. If the "member" argument in the offsetof(type, member)
+call is not constant, kmemleak considers the offset as zero since it
+cannot be determined at compilation time.
+Page allocations and ioremap are not tracked. Only the ARM and i386
+architectures are currently supported.
+Kmemleak API
+See the include/linux/memleak.h header for the functions prototype.
+memleak_init - initialize kmemleak
+memleak_alloc - notify of a memory block allocation
+memleak_free - notify of a memory block freeing
+memleak_padding - mark the boundaries of the data inside the block
+memleak_not_leak - mark a pointer as not a leak
+memleak_ignore - do not scan or report a pointer as leak
+memleak_scan_area - add scan areas inside a memory block
+memleak_insert_aliases - add aliases for a given type
+memleak_erase - erase an old value in a pointer variable
+memleak_typeid_raw - set the typeid for an allocated block
+memleak_container - statically declare a pointer alias
+memleak_typeid - set the typeid for an allocated block (takes
+ a type rather than typeid as argument)
+Dealing with false positives/negatives
+To reduce the false negatives, kmemleak provides the memleak_ignore,
+memleak_scan_area and memleak_erase functions. The task stacks also
+increase the amount of false negatives and their scanning is not
+enabled by default.
+To eliminate the false positives caused by code allocating a different
+size from the object one (either for alignment or for extra memory
+after the end of the structure), kmemleak provides the memleak_padding
+and memleak_typeid functions.
+For pointers known not to be leaks, kmemleak provides the
+memleak_not_leak function. The memleak_ignore could also be used if
+the memory block is known not to contain other pointers as it will no
+longer be scanned.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/