[PATCH v2] kernel: add kcov code coverage

From: Dmitry Vyukov
Date: Wed Jan 13 2016 - 07:48:50 EST


kcov provides code coverage collection for coverage-guided fuzzing
(randomized testing). Coverage-guided fuzzing is a testing technique
that uses coverage feedback to determine new interesting inputs to a
system. A notable user-space example is AFL
(http://lcamtuf.coredump.cx/afl/). However, this technique is not
widely used for kernel testing due to missing compiler and kernel
support.

kcov does not aim to collect as much coverage as possible. It aims
to collect more or less stable coverage that is function of syscall
inputs. To achieve this goal it does not collect coverage in
soft/hard interrupts and instrumentation of some inherently
non-deterministic or non-interesting parts of kernel is disbled
(e.g. scheduler, locking).

Currently there is a single coverage collection mode (tracing),
but the API anticipates additional collection modes.
Initially I also implemented a second mode which exposes
coverage in a fixed-size hash table of counters (what Quentin
used in his original patch). I've dropped the second mode for
simplicity.

This patch adds the necessary support on kernel side.
The complimentary compiler support was added in gcc revision 231296.

We've used this support to build syzkaller system call fuzzer,
which has found 90 kernel bugs in just 2 months:
https://github.com/google/syzkaller/wiki/Found-Bugs
We've also found 30+ bugs in our internal systems with syzkaller.
Another (yet unexplored) direction where kcov coverage would greatly
help is more traditional "blob mutation". For example, mounting
a random blob as a filesystem, or receiving a random blob over wire.

Why not gcov. Typical fuzzing loop looks as follows: (1) reset
coverage, (2) execute a bit of code, (3) collect coverage, repeat.
A typical coverage can be just a dozen of basic blocks (e.g. an
invalid input). In such context gcov becomes prohibitively expensive
as reset/collect coverage steps depend on total number of basic
blocks/edges in program (in case of kernel it is about 2M). Cost of
kcov depends only on number of executed basic blocks/edges. On top of
that, kernel requires per-thread coverage because there are
always background threads and unrelated processes that also produce
coverage. With inlined gcov instrumentation per-thread coverage is not
possible.

kcov exposes kernel PCs and control flow to user-space which
is insecure. But debugfs should not be mapped as user accessible.

Based on a patch by Quentin Casasnovas.
Signed-off-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
Reviewed-by: Kees Cook <keescook@xxxxxxxxxxxx>
---
Anticipating reasonable questions regarding usage of this feature.
Quentin Casasnovas and Vegard Nossum also plan to use kcov for
coverage-guided fuzzing. Currently they use a custom kernel patch
for their fuzzer and found several dozens of bugs.
There is also interest from Intel 0-DAY kernel test infrastructure.

Based on commit 67990608c8b95d2b8ccc29932376ae73d5818727.

v2: - added note to commit desciption that kcov is insecure,
but debugfs should not be mapped as user accessible.
- make CONFIG_KCOV depend on CONFIG_ARCH_HAS_KCOV
instead of conditional inclusion with if/endif
(as per Kees comments).
---
Documentation/kcov.txt | 111 +++++++++++++++
Makefile | 10 +-
arch/x86/Kconfig | 1 +
arch/x86/boot/Makefile | 6 +
arch/x86/boot/compressed/Makefile | 2 +
arch/x86/entry/vdso/Makefile | 2 +
arch/x86/kernel/Makefile | 5 +
arch/x86/kernel/apic/Makefile | 4 +
arch/x86/kernel/cpu/Makefile | 4 +
arch/x86/lib/Makefile | 3 +
arch/x86/mm/Makefile | 3 +
arch/x86/realmode/rm/Makefile | 2 +
include/linux/kcov.h | 19 +++
include/linux/sched.h | 10 ++
include/uapi/linux/kcov.h | 10 ++
kernel/Makefile | 9 ++
kernel/exit.c | 2 +
kernel/fork.c | 3 +
kernel/kcov/Makefile | 5 +
kernel/kcov/kcov.c | 287 ++++++++++++++++++++++++++++++++++++++
kernel/locking/Makefile | 3 +
kernel/rcu/Makefile | 4 +
kernel/sched/Makefile | 4 +
lib/Kconfig.debug | 22 +++
lib/Makefile | 9 ++
mm/Makefile | 15 ++
mm/kasan/Makefile | 1 +
scripts/Makefile.lib | 6 +
28 files changed, 561 insertions(+), 1 deletion(-)
create mode 100644 Documentation/kcov.txt
create mode 100644 include/linux/kcov.h
create mode 100644 include/uapi/linux/kcov.h
create mode 100644 kernel/kcov/Makefile
create mode 100644 kernel/kcov/kcov.c

diff --git a/Documentation/kcov.txt b/Documentation/kcov.txt
new file mode 100644
index 0000000..1fa6a3d
--- /dev/null
+++ b/Documentation/kcov.txt
@@ -0,0 +1,111 @@
+kcov: code coverage for fuzzing
+===============================
+
+kcov exposes kernel code coverage information in a form suitable for coverage-
+guided fuzzing (randomized testing). Coverage data of a running kernel is
+exported via the "kcov" debugfs file. Coverage collection is enabled on a task
+basis, and thus it can capture precise coverage of a single system call.
+
+Note that kcov does not aim to collect as much coverage as possible. It aims
+to collect more or less stable coverage that is function of syscall inputs.
+To achieve this goal it does not collect coverage in soft/hard interrupts
+and instrumentation of some inherently non-deterministic parts of kernel is
+disbled (e.g. scheduler, locking).
+
+Usage:
+======
+
+Configure kernel with:
+
+ CONFIG_KCOV=y
+ CONFIG_DEBUG_FS=y
+
+CONFIG_KCOV requires gcc built on revision 231296 or later.
+Profiling data will only become accessible once debugfs has been mounted:
+
+ mount -t debugfs none /sys/kernel/debug
+
+The following program demonstrates kcov usage from within a test program:
+
+#include <stdio.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+
+#define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
+#define KCOV_ENABLE _IO('c', 100)
+#define KCOV_DISABLE _IO('c', 101)
+#define COVER_SIZE (64<<10)
+
+int main(int argc, char **argv)
+{
+ int fd;
+ uint32_t *cover, n, i;
+
+ /* A single fd descriptor allows coverage collection on a single
+ * thread.
+ */
+ fd = open("/sys/kernel/debug/kcov", O_RDWR);
+ if (fd == -1)
+ perror("open");
+ /* Setup trace mode and trace size. */
+ if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE))
+ perror("ioctl");
+ /* Mmap buffer shared between kernel- and user-space. */
+ cover = (uint32_t*)mmap(NULL, COVER_SIZE * sizeof(uint32_t),
+ PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+ if ((void*)cover == MAP_FAILED)
+ perror("mmap");
+ /* Enable coverage collection on the current thread. */
+ if (ioctl(fd, KCOV_ENABLE, 0))
+ perror("ioctl");
+ /* Reset coverage from the tail of the ioctl() call. */
+ __atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED);
+ /* That's the target syscal call. */
+ read(-1, NULL, 0);
+ /* Read number of PCs collected. */
+ n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
+ /* PCs are shorten to uint32_t, so we need to restore the upper part. */
+ for (i = 0; i < n; i++)
+ printf("0xffffffff%0lx\n", (unsigned long)cover[i + 1]);
+ /* Disable coverage collection for the current thread. After this call
+ * coverage can be enabled for a different thread.
+ */
+ if (ioctl(fd, KCOV_DISABLE, 0))
+ perror("ioctl");
+ /* Free resources. */
+ if (munmap(cover, COVER_SIZE * sizeof(uint32_t)))
+ perror("munmap");
+ if (close(fd))
+ perror("close");
+ return 0;
+}
+
+After piping through addr2line output of the program looks as follows:
+
+SyS_read
+fs/read_write.c:562
+__fdget_pos
+fs/file.c:774
+__fget_light
+fs/file.c:746
+__fget_light
+fs/file.c:750
+__fget_light
+fs/file.c:760
+__fdget_pos
+fs/file.c:784
+SyS_read
+fs/read_write.c:562
+
+If a program needs to collect coverage from several threads (independently),
+it needs to open /sys/kernel/debug/kcov in each thread separately.
+
+The interface is fine-grained to allow efficient forking of test processes.
+That is, a parent process opens /sys/kernel/debug/kcov, enables trace mode,
+mmaps coverage buffer and then forks child processes in a loop. Child processes
+only need to enable coverage (disable happens automatically on thread end).
diff --git a/Makefile b/Makefile
index 70dea02..9fe404a 100644
--- a/Makefile
+++ b/Makefile
@@ -365,6 +365,7 @@ LDFLAGS_MODULE =
CFLAGS_KERNEL =
AFLAGS_KERNEL =
CFLAGS_GCOV = -fprofile-arcs -ftest-coverage
+CFLAGS_KCOV = -fsanitize-coverage=trace-pc


# Use USERINCLUDE when you must reference the UAPI directories only.
@@ -411,7 +412,7 @@ export MAKE AWK GENKSYMS INSTALLKERNEL PERL PYTHON UTS_MACHINE
export HOSTCXX HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS

export KBUILD_CPPFLAGS NOSTDINC_FLAGS LINUXINCLUDE OBJCOPYFLAGS LDFLAGS
-export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_GCOV CFLAGS_KASAN
+export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_GCOV CFLAGS_KCOV CFLAGS_KASAN
export KBUILD_AFLAGS AFLAGS_KERNEL AFLAGS_MODULE
export KBUILD_AFLAGS_MODULE KBUILD_CFLAGS_MODULE KBUILD_LDFLAGS_MODULE
export KBUILD_AFLAGS_KERNEL KBUILD_CFLAGS_KERNEL
@@ -667,6 +668,13 @@ endif
endif
KBUILD_CFLAGS += $(stackp-flag)

+ifdef CONFIG_KCOV
+ ifeq ($(call cc-option, $(CFLAGS_KCOV)),)
+ $(warning Cannot use CONFIG_KCOV: \
+ -fsanitize-coverage=trace-pc is not supported by compiler)
+ endif
+endif
+
ifeq ($(cc-name),clang)
KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,)
KBUILD_CPPFLAGS += $(call cc-option,-Wno-unknown-warning-option,)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ccfeded..40a9563 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -27,6 +27,7 @@ config X86
select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_FAST_MULTIPLIER
select ARCH_HAS_GCOV_PROFILE_ALL
+ select ARCH_HAS_KCOV if X86_64
select ARCH_HAS_PMEM_API if X86_64
select ARCH_HAS_MMIO_FLUSH
select ARCH_HAS_SG_CHAIN
diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
index 2ee62db..b2eb295 100644
--- a/arch/x86/boot/Makefile
+++ b/arch/x86/boot/Makefile
@@ -10,6 +10,12 @@
#

KASAN_SANITIZE := n
+# Kernel does not boot with kcov instrumentation here.
+# One of the problems observed was insertion of __sanitizer_cov_trace_pc()
+# callback into middle of per-cpu data enabling code. Thus the callback observed
+# inconsistent state and crashed. We are interested mostly in syscall coverage,
+# so boot code is not interesting anyway.
+KCOV_INSTRUMENT := n

# If you want to preset the SVGA mode, uncomment the next line and
# set SVGA_MODE to whatever number you want.
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 0a291cd..e625939 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -17,6 +17,8 @@
# compressed vmlinux.bin.all + u32 size of vmlinux.bin.all

KASAN_SANITIZE := n
+# Prevents link failures: __sanitizer_cov_trace_pc() is not linked in.
+KCOV_INSTRUMENT := n

targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma \
vmlinux.bin.xz vmlinux.bin.lzo vmlinux.bin.lz4
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index 265c0ed..1b663b8 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -4,6 +4,8 @@

KBUILD_CFLAGS += $(DISABLE_LTO)
KASAN_SANITIZE := n
+# Prevents link failures: __sanitizer_cov_trace_pc() is not linked in.
+KCOV_INSTRUMENT := n

VDSO64-$(CONFIG_X86_64) := y
VDSOX32-$(CONFIG_X86_X32_ABI) := y
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index b1b78ff..4648960 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -19,6 +19,11 @@ endif
KASAN_SANITIZE_head$(BITS).o := n
KASAN_SANITIZE_dumpstack.o := n
KASAN_SANITIZE_dumpstack_$(BITS).o := n
+# If instrumentation of this dir is enabled, boot hangs during first second.
+# Probably could be more selective here, but note that files related to irqs,
+# boot, dumpstack/stacktrace, etc are either non-interesting or can lead to
+# non-deterministic coverage.
+KCOV_INSTRUMENT := n

CFLAGS_irq.o := -I$(src)/../include/asm/trace

diff --git a/arch/x86/kernel/apic/Makefile b/arch/x86/kernel/apic/Makefile
index 8bb12dd..8f2a3d7 100644
--- a/arch/x86/kernel/apic/Makefile
+++ b/arch/x86/kernel/apic/Makefile
@@ -2,6 +2,10 @@
# Makefile for local APIC drivers and for the IO-APIC code
#

+# Leads to non-deterministic coverage that is not a function of syscall inputs.
+# In particualr, smp_apic_timer_interrupt() is called in random places.
+KCOV_INSTRUMENT := n
+
obj-$(CONFIG_X86_LOCAL_APIC) += apic.o apic_noop.o ipi.o vector.o
obj-y += hw_nmi.o

diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 5803130..c108683 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -8,6 +8,10 @@ CFLAGS_REMOVE_common.o = -pg
CFLAGS_REMOVE_perf_event.o = -pg
endif

+# If these files are instrumented, boot hangs during the first second.
+KCOV_INSTRUMENT_common.o := n
+KCOV_INSTRUMENT_perf_event.o := n
+
# Make sure load_percpu_segment has no stackprotector
nostackp := $(call cc-option, -fno-stack-protector)
CFLAGS_common.o := $(nostackp)
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index a501fa2..fefca94 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -2,6 +2,9 @@
# Makefile for x86 specific library files.
#

+# Produces uninteresting flaky coverage.
+KCOV_INSTRUMENT_delay.o := n
+
inat_tables_script = $(srctree)/arch/x86/tools/gen-insn-attr-x86.awk
inat_tables_maps = $(srctree)/arch/x86/lib/x86-opcode-map.txt
quiet_cmd_inat_tables = GEN $@
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index f9d38a4..147def6 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -1,3 +1,6 @@
+# Kernel does not boot with instrumentation of tlb.c.
+KCOV_INSTRUMENT_tlb.o := n
+
obj-y := init.o init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \
pat.o pgtable.o physaddr.o gup.o setup_nx.o

diff --git a/arch/x86/realmode/rm/Makefile b/arch/x86/realmode/rm/Makefile
index 2730d77..2abf667 100644
--- a/arch/x86/realmode/rm/Makefile
+++ b/arch/x86/realmode/rm/Makefile
@@ -7,6 +7,8 @@
#
#
KASAN_SANITIZE := n
+# Prevents link failures: __sanitizer_cov_trace_pc() is not linked in.
+KCOV_INSTRUMENT := n

always := realmode.bin realmode.relocs

diff --git a/include/linux/kcov.h b/include/linux/kcov.h
new file mode 100644
index 0000000..72ff663
--- /dev/null
+++ b/include/linux/kcov.h
@@ -0,0 +1,19 @@
+#ifndef _LINUX_KCOV_H
+#define _LINUX_KCOV_H
+
+#include <uapi/linux/kcov.h>
+
+struct task_struct;
+
+#ifdef CONFIG_KCOV
+
+void kcov_task_init(struct task_struct *t);
+void kcov_task_exit(struct task_struct *t);
+
+#else
+
+static inline void kcov_task_init(struct task_struct *t) {}
+static inline void kcov_task_exit(struct task_struct *t) {}
+
+#endif /* CONFIG_KCOV */
+#endif /* _LINUX_KCOV_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 61aa9bb..9e9e9f6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1807,6 +1807,16 @@ struct task_struct {
/* bitmask and counter of trace recursion */
unsigned long trace_recursion;
#endif /* CONFIG_TRACING */
+#ifdef CONFIG_KCOV
+ /* Coverage collection mode enabled for this task (0 if disabled). */
+ int kcov_mode;
+ /* Size of the kcov_area. */
+ unsigned long kcov_size;
+ /* Buffer for coverage collection. */
+ void *kcov_area;
+ /* kcov desciptor wired with this task or NULL. */
+ void *kcov;
+#endif
#ifdef CONFIG_MEMCG
struct mem_cgroup *memcg_in_oom;
gfp_t memcg_oom_gfp_mask;
diff --git a/include/uapi/linux/kcov.h b/include/uapi/linux/kcov.h
new file mode 100644
index 0000000..574e22e
--- /dev/null
+++ b/include/uapi/linux/kcov.h
@@ -0,0 +1,10 @@
+#ifndef _LINUX_KCOV_IOCTLS_H
+#define _LINUX_KCOV_IOCTLS_H
+
+#include <linux/types.h>
+
+#define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
+#define KCOV_ENABLE _IO('c', 100)
+#define KCOV_DISABLE _IO('c', 101)
+
+#endif /* _LINUX_KCOV_IOCTLS_H */
diff --git a/kernel/Makefile b/kernel/Makefile
index 53abf00..db7278b 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -19,6 +19,14 @@ CFLAGS_REMOVE_cgroup-debug.o = $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_irq_work.o = $(CC_FLAGS_FTRACE)
endif

+# Prevents flicker of uninteresting __do_softirq()/__local_bh_disable_ip()
+# in coverage traces.
+KCOV_INSTRUMENT_softirq.o := n
+# These are called from save_stack_trace() on slub debug path,
+# and produce insane amounts of uninteresting coverage.
+KCOV_INSTRUMENT_module.o := n
+KCOV_INSTRUMENT_extable.o := n
+
# cond_syscall is currently not LTO compatible
CFLAGS_sys_ni.o = $(DISABLE_LTO)

@@ -69,6 +77,7 @@ obj-$(CONFIG_AUDITSYSCALL) += auditsc.o
obj-$(CONFIG_AUDIT_WATCH) += audit_watch.o audit_fsnotify.o
obj-$(CONFIG_AUDIT_TREE) += audit_tree.o
obj-$(CONFIG_GCOV_KERNEL) += gcov/
+obj-$(CONFIG_KCOV) += kcov/
obj-$(CONFIG_KPROBES) += kprobes.o
obj-$(CONFIG_KGDB) += debug/
obj-$(CONFIG_DETECT_HUNG_TASK) += hung_task.o
diff --git a/kernel/exit.c b/kernel/exit.c
index 07110c6..49a1339 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -53,6 +53,7 @@
#include <linux/oom.h>
#include <linux/writeback.h>
#include <linux/shm.h>
+#include <linux/kcov.h>

#include <asm/uaccess.h>
#include <asm/unistd.h>
@@ -657,6 +658,7 @@ void do_exit(long code)
TASKS_RCU(int tasks_rcu_i);

profile_task_exit(tsk);
+ kcov_task_exit(tsk);

WARN_ON(blk_needs_flush_plug(tsk));

diff --git a/kernel/fork.c b/kernel/fork.c
index 6774e6b..7fa2ccd 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -75,6 +75,7 @@
#include <linux/aio.h>
#include <linux/compiler.h>
#include <linux/sysctl.h>
+#include <linux/kcov.h>

#include <asm/pgtable.h>
#include <asm/pgalloc.h>
@@ -384,6 +385,8 @@ static struct task_struct *dup_task_struct(struct task_struct *orig)

account_kernel_stack(ti, 1);

+ kcov_task_init(tsk);
+
return tsk;

free_ti:
diff --git a/kernel/kcov/Makefile b/kernel/kcov/Makefile
new file mode 100644
index 0000000..88892b7
--- /dev/null
+++ b/kernel/kcov/Makefile
@@ -0,0 +1,5 @@
+KCOV_INSTRUMENT := n
+KASAN_SANITIZE := n
+
+obj-y := kcov.o
+
diff --git a/kernel/kcov/kcov.c b/kernel/kcov/kcov.c
new file mode 100644
index 0000000..05ec361
--- /dev/null
+++ b/kernel/kcov/kcov.c
@@ -0,0 +1,287 @@
+#define pr_fmt(fmt) "kcov: " fmt
+
+#include <linux/compiler.h>
+#include <linux/types.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/printk.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/vmalloc.h>
+#include <linux/debugfs.h>
+#include <linux/uaccess.h>
+#include <linux/kcov.h>
+
+enum kcov_mode {
+ /* Tracing coverage collection mode.
+ * Covered PCs are collected in a per-task buffer.
+ */
+ kcov_mode_trace = 1,
+};
+
+/* kcov descriptor (one per opened debugfs file). */
+struct kcov {
+ /* Reference counter. We keep one for:
+ * - opened file descriptor
+ * - mmapped region (including copies after fork)
+ * - task with enabled coverage (we can't unwire it from another task)
+ */
+ atomic_t rc;
+ /* The lock protects state transitions of the descriptor:
+ * - initial state after open()
+ * - then there must be a single ioctl(KCOV_INIT_TRACE) call
+ * - then, mmap() call (several calls are allowed but not useful)
+ * - then, repeated enable/disable for a task (only one task a time
+ * allowed
+ */
+ spinlock_t lock;
+ enum kcov_mode mode;
+ unsigned long size;
+ void *area;
+ struct task_struct *t;
+};
+
+/* Entry point from instrumented code.
+ * This is called once per basic-block/edge.
+ */
+void __sanitizer_cov_trace_pc(void)
+{
+ struct task_struct *t;
+ enum kcov_mode mode;
+
+ t = current;
+ /* We are interested in code coverage as a function of a syscall inputs,
+ * so we ignore code executed in interrupts.
+ */
+ if (!t || in_interrupt())
+ return;
+ mode = READ_ONCE(t->kcov_mode);
+ if (mode == kcov_mode_trace) {
+ u32 *area;
+ u32 pos;
+
+ /* There is some code that runs in interrupts but for which
+ * in_interrupt() returns false (e.g. preempt_schedule_irq()).
+ * READ_ONCE()/barrier() effectively provides load-acquire wrt
+ * interrupts, there are paired barrier()/WRITE_ONCE() in
+ * kcov_ioctl_locked().
+ */
+ barrier();
+ area = t->kcov_area;
+ /* The first u32 is number of subsequent PCs. */
+ pos = READ_ONCE(area[0]) + 1;
+ if (likely(pos < t->kcov_size)) {
+ area[pos] = (u32)_RET_IP_;
+ WRITE_ONCE(area[0], pos);
+ }
+ }
+}
+EXPORT_SYMBOL(__sanitizer_cov_trace_pc);
+
+static void kcov_put(struct kcov *kcov)
+{
+ if (atomic_dec_and_test(&kcov->rc)) {
+ vfree(kcov->area);
+ kfree(kcov);
+ }
+}
+
+void kcov_task_init(struct task_struct *t)
+{
+ t->kcov_mode = 0;
+ t->kcov_size = 0;
+ t->kcov_area = NULL;
+ t->kcov = NULL;
+}
+
+void kcov_task_exit(struct task_struct *t)
+{
+ struct kcov *kcov;
+
+ kcov = t->kcov;
+ if (kcov == NULL)
+ return;
+ spin_lock(&kcov->lock);
+ BUG_ON(kcov->t != t);
+ /* Just to not leave dangling references behind. */
+ kcov_task_init(t);
+ kcov->t = NULL;
+ spin_unlock(&kcov->lock);
+ kcov_put(kcov);
+}
+
+static int kcov_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+ struct kcov *kcov;
+ unsigned long off;
+ struct page *page;
+
+ /* Map the preallocated kcov->area. */
+ kcov = vma->vm_file->private_data;
+ off = vmf->pgoff << PAGE_SHIFT;
+ if (off >= kcov->size * sizeof(u32))
+ return -1;
+
+ page = vmalloc_to_page(kcov->area + off);
+ get_page(page);
+ vmf->page = page;
+ return 0;
+}
+
+static void kcov_unmap(struct vm_area_struct *vma)
+{
+ kcov_put(vma->vm_file->private_data);
+}
+
+static void kcov_map_copied(struct vm_area_struct *vma)
+{
+ struct kcov *kcov;
+
+ kcov = vma->vm_file->private_data;
+ atomic_inc(&kcov->rc);
+}
+
+static const struct vm_operations_struct kcov_vm_ops = {
+ .fault = kcov_vm_fault,
+ .close = kcov_unmap,
+ /* Called on fork()/clone() when the mapping is copied. */
+ .open = kcov_map_copied,
+};
+
+static int kcov_mmap(struct file *filep, struct vm_area_struct *vma)
+{
+ int res = 0;
+ void *area;
+ struct kcov *kcov = vma->vm_file->private_data;
+
+ /* Can't call vmalloc_user() under a spinlock. */
+ area = vmalloc_user(vma->vm_end - vma->vm_start);
+ if (!area)
+ return -ENOMEM;
+
+ spin_lock(&kcov->lock);
+ if (kcov->mode == 0 || vma->vm_pgoff != 0 ||
+ vma->vm_end - vma->vm_start != kcov->size * sizeof(u32)) {
+ res = -EINVAL;
+ goto exit;
+ }
+ if (!kcov->area) {
+ kcov->area = area;
+ area = NULL;
+ }
+ /* The file drops a reference on close, but the file
+ * descriptor can be closed with the mmaping still alive so we keep
+ * a reference for those. This is put in kcov_unmap().
+ */
+ atomic_inc(&kcov->rc);
+ vma->vm_ops = &kcov_vm_ops;
+exit:
+ spin_unlock(&kcov->lock);
+ vfree(area);
+ return res;
+}
+
+static int kcov_open(struct inode *inode, struct file *filep)
+{
+ struct kcov *kcov;
+
+ kcov = kzalloc(sizeof(*kcov), GFP_KERNEL);
+ if (!kcov)
+ return -ENOMEM;
+ atomic_set(&kcov->rc, 1);
+ spin_lock_init(&kcov->lock);
+ filep->private_data = kcov;
+ return nonseekable_open(inode, filep);
+}
+
+static int kcov_close(struct inode *inode, struct file *filep)
+{
+ kcov_put(filep->private_data);
+ return 0;
+}
+
+static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
+ unsigned long arg)
+{
+ struct task_struct *t;
+
+ switch (cmd) {
+ case KCOV_INIT_TRACE:
+ /* Enable kcov in trace mode and setup buffer size.
+ * Must happen before anything else.
+ */
+ if (arg < 256 || arg > (128<<20) || arg & (arg - 1))
+ return -EINVAL;
+ if (kcov->mode != 0)
+ return -EBUSY;
+ kcov->mode = kcov_mode_trace;
+ kcov->size = arg;
+ return 0;
+ case KCOV_ENABLE:
+ /* Enable coverage for the current task.
+ * At this point user must have been enabled trace mode,
+ * and mmapped the file. Coverage collection is disabled only
+ * at task exit or voluntary by KCOV_DISABLE. After that it can
+ * be enabled for another task.
+ */
+ if (kcov->mode == 0 || kcov->area == NULL)
+ return -EINVAL;
+ if (kcov->t != NULL)
+ return -EBUSY;
+ t = current;
+ /* Cache in task struct for performance. */
+ t->kcov_size = kcov->size;
+ t->kcov_area = kcov->area;
+ /* See comment in __sanitizer_cov_trace_pc(). */
+ barrier();
+ WRITE_ONCE(t->kcov_mode, kcov->mode);
+ t->kcov = kcov;
+ kcov->t = t;
+ /* This is put either in kcov_task_exit() or in KCOV_DISABLE. */
+ atomic_inc(&kcov->rc);
+ return 0;
+ case KCOV_DISABLE:
+ /* Disable coverage for the current task. */
+ if (current->kcov != kcov)
+ return -EINVAL;
+ t = current;
+ BUG_ON(kcov->t != t);
+ kcov_task_init(t);
+ kcov->t = NULL;
+ BUG_ON(atomic_dec_and_test(&kcov->rc));
+ return 0;
+ default:
+ return -EINVAL;
+ }
+}
+
+static long kcov_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
+{
+ struct kcov *kcov;
+ int res;
+
+ kcov = filep->private_data;
+ spin_lock(&kcov->lock);
+ res = kcov_ioctl_locked(kcov, cmd, arg);
+ spin_unlock(&kcov->lock);
+ return res;
+}
+
+static const struct file_operations kcov_fops = {
+ .open = kcov_open,
+ .unlocked_ioctl = kcov_ioctl,
+ .mmap = kcov_mmap,
+ .release = kcov_close,
+};
+
+static int __init kcov_init(void)
+{
+ if (!debugfs_create_file("kcov", 0666, NULL, NULL, &kcov_fops)) {
+ pr_err("init failed\n");
+ return 1;
+ }
+ return 0;
+}
+
+device_initcall(kcov_init);
diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
index 8e96f6c..f816de9 100644
--- a/kernel/locking/Makefile
+++ b/kernel/locking/Makefile
@@ -1,3 +1,6 @@
+# Any varying coverage in these files is non-deterministic
+# and is generally not a function of system call inputs.
+KCOV_INSTRUMENT := n

obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o

diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
index 61a1656..032b2c0 100644
--- a/kernel/rcu/Makefile
+++ b/kernel/rcu/Makefile
@@ -1,3 +1,7 @@
+# Any varying coverage in these files is non-deterministic
+# and is generally not a function of system call inputs.
+KCOV_INSTRUMENT := n
+
obj-y += update.o sync.o
obj-$(CONFIG_SRCU) += srcu.o
obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o
diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
index 6768797..f0a9265 100644
--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -2,6 +2,10 @@ ifdef CONFIG_FUNCTION_TRACER
CFLAGS_REMOVE_clock.o = $(CC_FLAGS_FTRACE)
endif

+# These files are disabled because they produce non-interesting flaky coverage
+# that is not a function of syscall inputs. E.g. involuntary context switches.
+KCOV_INSTRUMENT := n
+
ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
# According to Alan Modra <alan@xxxxxxxxxxxxxxxx>, the -fno-omit-frame-pointer is
# needed for x86 only. Why this used to be enabled for all architectures is beyond
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e2a236a..facaa52 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -670,6 +670,28 @@ config DEBUG_STACKOVERFLOW

If in doubt, say "N".

+config ARCH_HAS_KCOV
+ bool
+ help
+ KCOV does not have any arch-specific code, but currently it is enabled
+ only for x86_64. KCOV requires testing on other archs, and most likely
+ disabling of instrumentation for some early boot code.
+
+config KCOV
+ bool "Code coverage for fuzzing"
+ depends on ARCH_HAS_KCOV && !RANDOMIZE_BASE
+ help
+ KCOV exposes kernel code coverage information in a form suitable
+ for coverage-guided fuzzing (randomized testing).
+
+ RANDOMIZE_BASE is not supported. KCOV exposes PC values that are meant
+ to be stable on different machines and across reboots. RANDOMIZE_BASE
+ breaks this assumption. Potentially it can be supported by subtracting
+ _stext from [_stext, _send), but it is more tricky (and slow) for
+ modules.
+
+ For more details, see Documentation/kcov.txt.
+
source "lib/Kconfig.kmemcheck"

source "lib/Kconfig.kasan"
diff --git a/lib/Makefile b/lib/Makefile
index 180dd4d..7af8687 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -7,6 +7,15 @@ ORIG_CFLAGS := $(KBUILD_CFLAGS)
KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS))
endif

+# These files are disabled because they produce lots of non-interesting and/or
+# flaky coverage that is not a function of syscall inputs. For example,
+# rbtree can be global and individual rotations don't correlate with inputs.
+KCOV_INSTRUMENT_string.o := n
+KCOV_INSTRUMENT_rbtree.o := n
+KCOV_INSTRUMENT_list_debug.o := n
+KCOV_INSTRUMENT_debugobjects.o := n
+KCOV_INSTRUMENT_dynamic_debug.o := n
+
lib-y := ctype.o string.o vsprintf.o cmdline.o \
rbtree.o radix-tree.o dump_stack.o timerqueue.o\
idr.o int_sqrt.o extable.o \
diff --git a/mm/Makefile b/mm/Makefile
index 2ed4319..cf751bb 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -5,6 +5,21 @@
KASAN_SANITIZE_slab_common.o := n
KASAN_SANITIZE_slub.o := n

+# These files are disabled because they produce non-interesting and/or
+# flaky coverage that is not a function of syscall inputs. E.g. slab is out of
+# free pages, or a task is migrated between nodes.
+KCOV_INSTRUMENT_slab_common.o := n
+KCOV_INSTRUMENT_slob.o := n
+KCOV_INSTRUMENT_slab.o := n
+KCOV_INSTRUMENT_slub.o := n
+KCOV_INSTRUMENT_page_alloc.o := n
+KCOV_INSTRUMENT_debug-pagealloc.o := n
+KCOV_INSTRUMENT_kmemleak.o := n
+KCOV_INSTRUMENT_kmemcheck.o := n
+KCOV_INSTRUMENT_memcontrol.o := n
+KCOV_INSTRUMENT_mmzone.o := n
+KCOV_INSTRUMENT_vmstat.o := n
+
mmu-y := nommu.o
mmu-$(CONFIG_MMU) := gup.o highmem.o memory.o mincore.o \
mlock.o mmap.o mprotect.o mremap.o msync.o rmap.o \
diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile
index 6471014..ad97f0b 100644
--- a/mm/kasan/Makefile
+++ b/mm/kasan/Makefile
@@ -1,4 +1,5 @@
KASAN_SANITIZE := n
+KCOV_INSTRUMENT := n

CFLAGS_REMOVE_kasan.o = -pg
# Function splitter causes unnecessary splits in __asan_load1/__asan_store1
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 79e8661..ebf6f1b 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -129,6 +129,12 @@ _c_flags += $(if $(patsubst n%,, \
$(CFLAGS_KASAN))
endif

+ifeq ($(CONFIG_KCOV),y)
+_c_flags += $(if $(patsubst n%,, \
+ $(KCOV_INSTRUMENT_$(basetarget).o)$(KCOV_INSTRUMENT)y), \
+ $(CFLAGS_KCOV))
+endif
+
# If building the kernel in a separate objtree expand all occurrences
# of -Idir to -I$(srctree)/dir except for absolute paths (starting with '/').

--
2.6.0.rc2.230.g3dd15c0