Re: [PATCH v2 1/4] llvm-cov: add Clang's Source-based Code Coverage support
From: Nathan Chancellor
Date: Tue Oct 01 2024 - 20:31:10 EST
Hi Wentao,
On Wed, Sep 04, 2024 at 11:32:42PM -0500, Wentao Zhang wrote:
> Add infrastructure to support Clang's source-based code coverage [1]. This
> includes debugfs entries for serializing profiles and resetting
> counters/bitmaps. Also adds coverage flags and kconfig options.
Some superficial comments below. My general understanding of
implementing a standalone compiler-rt interface is not super deep, so I
won't really comment on that code, but it seems generally fine to me.
>From a coding style perspective, everything seems reasonable, assuming
it passes checkpatch.pl (I didn't check). The Kbuild bits look good as
well, they match the gcov implementation.
As most of my comments are more nits than anything else, feel free to
carry this over:
Reviewed-by: Nathan Chancellor <nathan@xxxxxxxxxx>
> diff --git a/kernel/llvm-cov/Kconfig b/kernel/llvm-cov/Kconfig
> new file mode 100644
> index 000000000..9241fdfb0
> --- /dev/null
> +++ b/kernel/llvm-cov/Kconfig
> @@ -0,0 +1,64 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +menu "Clang's source-based kernel coverage measurement (EXPERIMENTAL)"
> +
> +config ARCH_HAS_LLVM_COV
> + bool
> +
> +config ARCH_HAS_LLVM_COV_PROFILE_ALL
> + bool
It might be worth mentioning somewhere in here what architectures need
to do to implement this in their architectures. I can surmise from the
x86 patches that it appears to just be excluding files from
instrumentation that need to be free from it and adding the sections to
their linker file but if there are any gotchas, documentation would help
other architectures impelement this, since it seems pretty useful for
development.
> +config LLVM_COV_KERNEL
> + bool "Enable Clang's source-based kernel coverage measurement"
> + depends on DEBUG_FS
> + depends on ARCH_HAS_LLVM_COV
> + depends on CC_IS_CLANG && CLANG_VERSION >= 180000
> + default n
Drop this, it is the default.
> + help
> + This option enables Clang's Source-based Code Coverage.
> +
> + If unsure, say N.
> +
> + On a kernel compiled with this option, run your test suites, and
> + download the raw profile from /sys/kernel/debug/llvm-cov/profraw.
> + This file can then be converted into the indexed format with
> + llvm-profdata and used to generate coverage reports with llvm-cov.
> +
> + Additionally specify CONFIG_LLVM_COV_PROFILE_ALL=y to get profiling
> + data for the entire kernel. To enable profiling for specific files or
> + directories, add a line similar to the following to the respective
> + Makefile:
> +
> + For a single file (e.g. main.o):
> + LLVM_COV_PROFILE_main.o := y
> +
> + For all files in one directory:
> + LLVM_COV_PROFILE := y
> +
> + To exclude files from being profiled even when
> + CONFIG_LLVM_COV_PROFILE_ALL is specified, use:
> +
> + LLVM_COV_PROFILE_main.o := n
> + and:
> + LLVM_COV_PROFILE := n
> +
> + Note that a kernel compiled with coverage flags will be significantly
> + larger and run slower.
> +
> + Note that the debugfs filesystem has to be mounted to access the raw
> + profile.
> +
> +config LLVM_COV_PROFILE_ALL
> + bool "Profile entire Kernel"
> + depends on !COMPILE_TEST
> + depends on LLVM_COV_KERNEL
> + depends on ARCH_HAS_LLVM_COV_PROFILE_ALL
> + default n
Ditto as above.
> + help
> + This options activates profiling for the entire kernel.
> +
> + If unsure, say N.
> +
> + Note that a kernel compiled with profiling flags will be significantly
> + larger and run slower.
> +
> +endmenu
> diff --git a/kernel/llvm-cov/Makefile b/kernel/llvm-cov/Makefile
> new file mode 100644
> index 000000000..f6a236562
> --- /dev/null
> +++ b/kernel/llvm-cov/Makefile
> @@ -0,0 +1,5 @@
> +# SPDX-License-Identifier: GPL-2.0
> +GCOV_PROFILE := n
Adding this seems to imply that GCOV and llvm-cov can be active at the
same time and work together? Does that make sense? Are there things that
GCOV can track that llvm-cov cannot? If any of the answers to those
questions are no, would it make sense to make these mutually exclusive
via Kconfig?
> +LLVM_COV_PROFILE := n
> +
> +obj-y += fs.o
...
> diff --git a/kernel/llvm-cov/llvm-cov.h b/kernel/llvm-cov/llvm-cov.h
> new file mode 100644
> index 000000000..d9551a685
> --- /dev/null
> +++ b/kernel/llvm-cov/llvm-cov.h
> @@ -0,0 +1,156 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2019 Sami Tolvanen <samitolvanen@xxxxxxxxxx>, Google, Inc.
> + * Copyright (C) 2024 Jinghao Jia <jinghao7@xxxxxxxxxxxx>, UIUC
> + * Copyright (C) 2024 Wentao Zhang <wentaoz5@xxxxxxxxxxxx>, UIUC
> + *
> + * This software is licensed under the terms of the GNU General Public
> + * License version 2, as published by the Free Software Foundation, and
> + * may be copied, distributed, and modified under those terms.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
FWIW, I think you can drop this boilerplate text when you include the
SPDX indentifier:
"An alternative to boilerplate text is the use of Software Package Data
Exchange (SPDX) license identifiers in each source file."
https://www.kernel.org/doc/html/latest/process/license-rules.html
> +#if defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION >= 190000
> +#define INSTR_PROF_RAW_VERSION 10
> +#define INSTR_PROF_DATA_ALIGNMENT 8
> +#define IPVK_LAST 2
> +#elif defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION >= 180000
> +#define INSTR_PROF_RAW_VERSION 9
> +#define INSTR_PROF_DATA_ALIGNMENT 8
> +#define IPVK_LAST 1
> +#endif
It might be interesting to note the commit from LLVM 19 that introduced
the need for this change and what happens when it is not updated so that
any future breakage can be quickly addressed by people other than
yourself.
Cheers,
Nathan