Re: [PATCH v2 2/4] llvm-cov: add Clang's MC/DC support
From: Nathan Chancellor
Date: Tue Oct 01 2024 - 21:10:49 EST
Hi Wentao,
On Wed, Sep 04, 2024 at 11:32:43PM -0500, Wentao Zhang wrote:
> Add infrastructure to enable Clang's Modified Condition/Decision Coverage
> (MC/DC) [1].
>
> Clang has added MC/DC support as of its 18.1.0 release. MC/DC is a fine-
> grained coverage metric required by many automotive and aviation industrial
> standards for certifying mission-critical software [2].
>
> In the following example from arch/x86/events/probe.c, llvm-cov gives the
> MC/DC measurement for the compound logic decision at line 43.
>
>    43|     12|    if (msr[bit].test && !msr[bit].test(bit, data))
>   ------------------
>   |---> MC/DC Decision Region (43:8) to (43:50)
>   |
>   |  Number of Conditions: 2
>   |     Condition C1 --> (43:8)
>   |     Condition C2 --> (43:25)
>   |
>   |  Executed MC/DC Test Vectors:
>   |
>   |     C1, C2    Result
>   |  1 { T,  F  = F      }
>   |  2 { T,  T  = T      }
>   |
>   |  C1-Pair: not covered
>   |  C2-Pair: covered: (1,2)
>   |  MC/DC Coverage for Decision: 50.00%
>   |
>   ------------------
>    44|      5|        continue;
>
> As the results suggest, during the span of measurement, only condition C2
> (!msr[bit].test(bit, data)) is covered. That means C2 was evaluated to both
> true and false, and in those test vectors C2 affected the decision outcome
> independently. Therefore MC/DC for this decision is 1 out of 2 (50.00%).
Thanks a lot for the detail in the commit message. Your first talk at
LPC in the Refereed Track was excellent as well. If the video for that
talk becomes available soon, it would be helpful to link that in the
commit message as well.
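
To make the pair reasoning in the quoted output concrete, here is a
minimal sketch of how an independence pair can be checked from recorded
test vectors. Every name in it (cond_val, test_vector,
is_independence_pair, condition_covered, mcdc_percent, MAX_CONDS) is
hypothetical and chosen for illustration; this is not llvm-cov's
implementation. It assumes a short-circuited condition is recorded as
"skipped" and is allowed to match either value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CONDS 8	/* arbitrary bound for this sketch */

/* A condition is false, true, or never evaluated (short-circuited). */
enum cond_val { COND_FALSE, COND_TRUE, COND_SKIPPED };

struct test_vector {
	enum cond_val cond[MAX_CONDS];
	bool result;
};

/*
 * Two vectors form an independence pair for condition idx when the
 * decision result differs, condition idx was evaluated in both with
 * different values, and every other condition either matches or was
 * skipped in at least one of the two vectors.
 */
bool is_independence_pair(const struct test_vector *a,
			  const struct test_vector *b,
			  size_t nconds, size_t idx)
{
	size_t i;

	assert(nconds <= MAX_CONDS && idx < nconds);

	if (a->result == b->result)
		return false;
	if (a->cond[idx] == COND_SKIPPED || b->cond[idx] == COND_SKIPPED)
		return false;
	if (a->cond[idx] == b->cond[idx])
		return false;

	for (i = 0; i < nconds; i++) {
		if (i == idx)
			continue;
		if (a->cond[i] == COND_SKIPPED || b->cond[i] == COND_SKIPPED)
			continue;
		if (a->cond[i] != b->cond[i])
			return false;
	}
	return true;
}

/* A condition is covered if any two executed vectors form a pair for it. */
bool condition_covered(const struct test_vector *tv, size_t ntv,
		       size_t nconds, size_t idx)
{
	size_t i, j;

	for (i = 0; i < ntv; i++)
		for (j = i + 1; j < ntv; j++)
			if (is_independence_pair(&tv[i], &tv[j], nconds, idx))
				return true;
	return false;
}

/* Integer percentage of conditions that have an independence pair. */
unsigned int mcdc_percent(const struct test_vector *tv, size_t ntv,
			  size_t nconds)
{
	size_t idx, covered = 0;

	for (idx = 0; idx < nconds; idx++)
		if (condition_covered(tv, ntv, nconds, idx))
			covered++;
	return nconds ? (unsigned int)(covered * 100 / nconds) : 0;
}
```

Fed the two vectors from the quoted output, { T, F = F } and
{ T, T = T }, this reports C2 as covered and C1 as not covered, i.e. 50.
Adding a vector where C1 is false and C2 is skipped would give C1 a pair
with { T, T = T } and bring the decision to 100.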
> As of Clang 19, users can determine the max number of conditions in a
> decision to measure via option LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS, which
> controls -fmcdc-max-conditions flag of Clang cc1 [3]. Since MC/DC
> implementation utilizes bitmaps to track the execution of test vectors,
> more memory is consumed if larger decisions are getting counted. The
Some of this could potentially go in the Kconfig help text below, as it
seems relevant for users deciding whether to modify its value.
> maximum value supported by Clang is 32767. According to local experiments,
> the working maximum for Linux kernel is 46, with the largest decisions in
> kernel codebase (with 47 conditions, as of v6.11) excluded, otherwise the
> kernel image size limit will be exceeded. The largest decisions in kernel
> are contributed for example by macros checking CPUID.
>
> Code exceeding LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS will produce compiler
> warnings.
>
> As of LLVM 19, certain expressions are still not covered, and will produce
> build warnings when they are encountered:
>
> "[...] if a boolean expression is embedded in the nest of another boolean
> expression but separated by a non-logical operator, this is also not
> supported. For example, in x = (a && b && c && func(d && f)), the d && f
> case starts a new boolean expression that is separated from the other
> conditions by the operator func(). When this is encountered, a warning
> will be generated and the boolean expression will not be
> instrumented." [4]
These two sets of warnings appear to be pretty noisy in my build
testing... Is there any way to shut them up? Perhaps it is good for
users to see these limitations, but they basically make the build output
useless. If there were switches, the warnings could be disabled by
default, with a Kconfig option to turn them on for users who care about
seeing which parts of their code are not instrumented. I could see
developers wanting to run this for writing tests, and they might not
care about this as much as someone else might.

I did leave LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS at its default value.
Perhaps there is a more reasonable default that would result in less
noisy build output without running afoul of potential memory usage
concerns? I assume the mention of memory consumption means that memory
usage may be a concern for the type of deployments this technology would
commonly be used with?
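
On the nested-expression limitation quoted above, a small sketch of the
shape in question and one possible rewrite may help anyone who wants a
given decision instrumented. The functions func(), nested(), hoisted()
and check_equivalent() are hypothetical stand-ins, not kernel code, and
the assumption that the hoisted form ends up as two separately
instrumented decisions comes from reading the quoted documentation, not
from checking it against the warnings in this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for func() in the quoted documentation example. */
bool func(bool v)
{
	return v;
}

/*
 * Shape from the quoted example: "d && f" sits inside a call that is
 * itself a condition of the outer decision. Per the quoted text, the
 * inner expression triggers a warning and is not instrumented.
 */
bool nested(bool a, bool b, bool c, bool d, bool f)
{
	return a && b && c && func(d && f);
}

/*
 * Possible rewrite: hoist the inner expression into its own statement
 * so that each && chain is a separate top-level decision.
 * Caveat: "d && f" is now evaluated even when "a && b && c" is false,
 * which changes behaviour if d or f have side effects.
 */
bool hoisted(bool a, bool b, bool c, bool d, bool f)
{
	bool inner = d && f;

	return a && b && c && func(inner);
}

/* Check that both forms agree for all 32 input combinations. */
void check_equivalent(void)
{
	unsigned int bits;

	for (bits = 0; bits < 32; bits++) {
		bool a = bits & 1, b = bits & 2, c = bits & 4;
		bool d = bits & 8, f = bits & 16;

		assert(nested(a, b, c, d, f) == hoisted(a, b, c, d, f));
	}
}
```

The two forms only agree because the operands here are plain values; for
operands with side effects the eager evaluation in hoisted() would need a
closer look before making this kind of change.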
> Link: https://en.wikipedia.org/wiki/Modified_condition%2Fdecision_coverage [1]
> Link: https://digital-library.theiet.org/content/journals/10.1049/sej.1994.0025 [2]
> Link: https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798 [3]
> Link: https://clang.llvm.org/docs/SourceBasedCodeCoverage.html#mc-dc-instrumentation [4]
Thank you for using this link format :)
> Signed-off-by: Wentao Zhang <wentaoz5@xxxxxxxxxxxx>
> Reviewed-by: Chuck Wolber <chuck.wolber@xxxxxxxxxx>
> Tested-by: Chuck Wolber <chuck.wolber@xxxxxxxxxx>
From an actual code perspective, this looks good to me.
Reviewed-by: Nathan Chancellor <nathan@xxxxxxxxxx>
> diff --git a/Makefile b/Makefile
> index 51498134c..1185b38d6 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -740,6 +740,12 @@ all: vmlinux
> CFLAGS_LLVM_COV := -fprofile-instr-generate -fcoverage-mapping
> export CFLAGS_LLVM_COV
>
> +CFLAGS_LLVM_COV_MCDC := -fcoverage-mcdc
> +ifdef CONFIG_LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS
> +CFLAGS_LLVM_COV_MCDC += -Xclang -fmcdc-max-conditions=$(CONFIG_LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS)
Why is -Xclang needed here? Is this not a full frontend flag?
> +endif
> +export CFLAGS_LLVM_COV_MCDC
> +
> CFLAGS_GCOV := -fprofile-arcs -ftest-coverage
> ifdef CONFIG_CC_IS_GCC
> CFLAGS_GCOV += -fno-tree-loop-im