Re: [PATCH v2 1/3] kcov: support comparison operands collection

From: Dmitry Vyukov
Date: Mon Oct 09 2017 - 14:15:37 EST


On Mon, Oct 9, 2017 at 5:46 PM, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> Hi,
>
> I look forward to using this! :)
>
> I just have afew comments below.
>
> On Mon, Oct 09, 2017 at 05:05:19PM +0200, Alexander Potapenko wrote:
>> +/*
>> + * Defines the format for the types of collected comparisons.
>> + */
>> +enum kcov_cmp_type {
>> + /*
>> + * LSB shows whether one of the arguments is a compile-time constant.
>> + */
>> + KCOV_CMP_CONST = 1,
>> + /*
>> + * Second and third LSBs contain the size of arguments (1/2/4/8 bytes).
>> + */
>> + KCOV_CMP_SIZE1 = 0,
>> + KCOV_CMP_SIZE2 = 2,
>> + KCOV_CMP_SIZE4 = 4,
>> + KCOV_CMP_SIZE8 = 6,
>> + KCOV_CMP_SIZE_MASK = 6,
>> +};
>
> Given that LSB is meant to be OR-ed in, (and hence combinations of
> values are meaningful) I don't think it makes sense for this to be an
> enum. This would clearer as something like:
>
> /*
> * The format for the types of collected comparisons.
> *
> * Bit 0 shows whether one of the arguments is a compile-time constant.
> * Bits 1 & 2 contain log2 of the argument size, up to 8 bytes.
> */
> #define KCOV_CMP_CONST (1 << 0)
> #define KCOV_CMP_SIZE(n) ((n) << 1)
> #define KCOV_CMP_MASK KCOV_CMP_SIZE(3)
>
> ... I note that a few places in the kernel use a 128-bit type. Are
> 128-bit comparisons not instrumented?

Yes, they are not instrumented.
How many are there? Can you give some examples?



>> +static bool check_kcov_mode(enum kcov_mode needed_mode, struct task_struct *t)
>> +{
>> + enum kcov_mode mode;
>> +
>> + /*
>> + * We are interested in code coverage as a function of a syscall inputs,
>> + * so we ignore code executed in interrupts.
>> + */
>> + if (!t || !in_task())
>> + return false;
>
> This !t check can go, as with the one in __sanitizer_cov_trace_pc, since
> t is always current, and therefore cannot be NULL.
>
> IIRC there's a patch queued for that, which this may conflict with.
>
>> + mode = READ_ONCE(t->kcov_mode);
>> + /*
>> + * There is some code that runs in interrupts but for which
>> + * in_interrupt() returns false (e.g. preempt_schedule_irq()).
>> + * READ_ONCE()/barrier() effectively provides load-acquire wrt
>> + * interrupts, there are paired barrier()/WRITE_ONCE() in
>> + * kcov_ioctl_locked().
>> + */
>> + barrier();
>> + if (mode != needed_mode)
>> + return false;
>> + return true;
>
> This would be simpler as:
>
> return mode == needed_mode;
>
> [...]
>
>> + area = t->kcov_area;
>> + /* The first 64-bit word is the number of subsequent PCs. */
>> + pos = READ_ONCE(area[0]) + 1;
>> + if (likely(pos < t->kcov_size)) {
>> + area[pos] = ip;
>> + WRITE_ONCE(area[0], pos);
>
> Not a new problem, but if the area for one thread is mmap'd, and read by
> another thread, these two writes could be seen out-of-order, since we
> don't have an smp_wmb() between them.
>
> I guess Syzkaller doesn't read the mmap'd kcov file from another thread?


Yes, that's the intention. If you read coverage from another thread,
you can't know coverage from what exactly you read. So the usage
pattern is:

reset coverage;
do something;
read coverage;



>
>> }
>> }
>> EXPORT_SYMBOL(__sanitizer_cov_trace_pc);
>>
>> +#ifdef CONFIG_KCOV_ENABLE_COMPARISONS
>> +static void write_comp_data(u64 type, u64 arg1, u64 arg2, u64 ip)
>> +{
>> + struct task_struct *t;
>> + u64 *area;
>> + u64 count, start_index, end_pos, max_pos;
>> +
>> + t = current;
>> + if (!check_kcov_mode(KCOV_MODE_TRACE_CMP, t))
>> + return;
>> +
>> +#ifdef CONFIG_RANDOMIZE_BASE
>> + ip -= kaslr_offset();
>> +#endif
>
> Given we have this in two places, it might make sense to have a helper
> like:
>
> unsigned long canonicalize_ip(unsigned long ip)
> {
> #ifdef CONFIG_RANDOMIZE_BASE
> ip -= kaslr_offset();
> #endif
> return ip;
> }
>
> ... to minimize the ifdeffery elsewhere.
>
>> +
>> + /*
>> + * We write all comparison arguments and types as u64.
>> + * The buffer was allocated for t->kcov_size unsigned longs.
>> + */
>> + area = (u64 *)t->kcov_area;
>> + max_pos = t->kcov_size * sizeof(unsigned long);
>> +
>> + count = READ_ONCE(area[0]);
>> +
>> + /* Every record is KCOV_WORDS_PER_CMP 64-bit words. */
>> + start_index = 1 + count * KCOV_WORDS_PER_CMP;
>> + end_pos = (start_index + KCOV_WORDS_PER_CMP) * sizeof(u64);
>> + if (likely(end_pos <= max_pos)) {
>> + area[start_index] = type;
>> + area[start_index + 1] = arg1;
>> + area[start_index + 2] = arg2;
>> + area[start_index + 3] = ip;
>> + WRITE_ONCE(area[0], count + 1);
>
> That ordering problem applies here, too.
>
> Thanks,
> Mark.