Re: [PATCH 1/3] kcov: support comparison operands collection

From: Alexander Potapenko
Date: Wed Aug 30 2017 - 15:04:29 EST


On Wed, Aug 30, 2017 at 8:23 PM, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> Hi,
>
> On Wed, Aug 30, 2017 at 06:23:29PM +0200, Dmitry Vyukov wrote:
>> From: Victor Chibotaru <tchibo@xxxxxxxxxx>
>>
>> Enables kcov to collect comparison operands from instrumented code.
>> This is done by using Clang's -fsanitize=trace-cmp instrumentation
>> (currently not available for GCC).
Hi Mark,

> What's needed to build the kernel with Clang these days?
Shameless plug: https://github.com/ramosian-glider/clang-kernel-build
We are currently one patch away from booting the Clang-built kernel on x86.
The patch in my repo is far from perfect, Josh Poimboeuf has a better
one at https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=ASM_CALL

> I was under the impression that it still wasn't possible to build arm64
> with clang due to a number of missing features (e.g. the %a assembler
> output template).
Sorry, I haven't experimented with ARM much. Nick Desaulniers or Greg
Hackmann should know the details, I'll ask them.
I think the current status is "almost there".
>> The comparison operands help a lot in fuzz testing. E.g. they are
>> used in Syzkaller to cover the interiors of conditional statements
>> with way less attempts and thus make previously unreachable code
>> reachable.
>>
>> To allow separate collection of coverage and comparison operands two
>> different work modes are implemented. Mode selection is now done via
>> a KCOV_ENABLE ioctl call with corresponding argument value.
>>
>> Signed-off-by: Victor Chibotaru <tchibo@xxxxxxxxxx>
>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> Cc: Mark Rutland <mark.rutland@xxxxxxx>
>> Cc: Alexander Popov <alex.popov@xxxxxxxxx>
>> Cc: Andrey Ryabinin <aryabinin@xxxxxxxxxxxxx>
>> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
>> Cc: Vegard Nossum <vegard.nossum@xxxxxxxxxx>
>> Cc: Quentin Casasnovas <quentin.casasnovas@xxxxxxxxxx>
>> Cc: syzkaller@xxxxxxxxxxxxxxxx
>> Cc: linux-mm@xxxxxxxxx
>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>> ---
>> Clang instrumentation:
>> https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-data-flow
>
> How stable is this?
>
> The comment at the end says "This interface is a subject to change."
>
> [...]
>
>> diff --git a/kernel/kcov.c b/kernel/kcov.c
>> index cd771993f96f..2abce5dfa2df 100644
>> --- a/kernel/kcov.c
>> +++ b/kernel/kcov.c
>> @@ -21,13 +21,21 @@
>> #include <linux/kcov.h>
>> #include <asm/setup.h>
>>
>> +/* Number of words written per one comparison. */
>> +#define KCOV_WORDS_PER_CMP 3
>
> Could you please expand the comment to cover what a "word" is?
>
> Generally, "word" is an ambiguous term, and it's used inconsitently in
> this file as of this patch. For comparison coverage, a "word" is assumed
> to always be 64-bit, (which makes sxense given 64-bit comparisons), but
> for branch coverage a "word" refers to an unsigned long, which would be
> 32-bit on a 32-bit platform.
>
> [...]
>
>> +static bool check_kcov_mode(enum kcov_mode needed_mode, struct task_struct *t)
>
> Perhaps kcov_mode_is_active()?
>
> That would better describe what is being checked.
>
>> +{
>> + enum kcov_mode mode;
>> +
>> + /*
>> + * We are interested in code coverage as a function of a syscall inputs,
>> + * so we ignore code executed in interrupts.
>> + */
>> + if (!t || !in_task())
>> + return false;
>> + mode = READ_ONCE(t->kcov_mode);
>> + /*
>> + * There is some code that runs in interrupts but for which
>> + * in_interrupt() returns false (e.g. preempt_schedule_irq()).
>> + * READ_ONCE()/barrier() effectively provides load-acquire wrt
>> + * interrupts, there are paired barrier()/WRITE_ONCE() in
>> + * kcov_ioctl_locked().
>> + */
>> + barrier();
>> + if (mode != needed_mode)
>> + return false;
>> + return true;
>
> This would be simlper as:
>
> barrier();
> return mode == needed_mode;
>
> [...]
>
>> +#ifdef CONFIG_KCOV_ENABLE_COMPARISONS
>> +static void write_comp_data(u64 type, u64 arg1, u64 arg2)
>> +{
>> + struct task_struct *t;
>> + u64 *area;
>> + u64 count, start_index, end_pos, max_pos;
>> +
>> + t = current;
>> + if (!check_kcov_mode(KCOV_MODE_TRACE_CMP, t))
>> + return;
>> +
>> + /*
>> + * We write all comparison arguments and types as u64.
>> + * The buffer was allocated for t->kcov_size unsigned longs.
>> + */
>> + area = (u64 *)t->kcov_area;
>> + max_pos = t->kcov_size * sizeof(unsigned long);
>
> Perhaps it would make more sense for k->kcov_size to be in bytes, if
> different options will have differing record sizes?
>
>> +
>> + count = READ_ONCE(area[0]);
>> +
>> + /* Every record is KCOV_WORDS_PER_CMP words. */
>
> As above, please be explicit about what a "word" is, or avoid using
> "word" terminology.
>
> Thanks,
> Mark.
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+unsubscribe@xxxxxxxxxxxxxxxxx
> For more options, visit https://groups.google.com/d/optout.



--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-StraÃe, 33
80636 MÃnchen

GeschÃftsfÃhrer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg