Re: [PATCH 1/8] kcsan: Add Kernel Concurrency Sanitizer infrastructure
From: Marco Elver
Date: Wed Oct 16 2019 - 15:34:26 EST
On Wed, 16 Oct 2019 at 20:44, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Wed, Oct 16, 2019 at 10:39:52AM +0200, Marco Elver wrote:
>
> > +bool __kcsan_check_watchpoint(const volatile void *ptr, size_t size,
> > + bool is_write)
> > +{
> > + atomic_long_t *watchpoint;
> > + long encoded_watchpoint;
> > + unsigned long flags;
> > + enum kcsan_report_type report_type;
> > +
> > + if (unlikely(!is_enabled()))
> > + return false;
> > +
> > + watchpoint = find_watchpoint((unsigned long)ptr, size, !is_write,
> > + &encoded_watchpoint);
> > + if (watchpoint == NULL)
> > + return true;
> > +
> > + flags = user_access_save();
>
> Could use a comment on why find_watchpoint() is save to call without
> user_access_save() on.
Thanks, will add a comment for v2.
> > + if (!try_consume_watchpoint(watchpoint, encoded_watchpoint)) {
> > + /*
> > + * The other thread may not print any diagnostics, as it has
> > + * already removed the watchpoint, or another thread consumed
> > + * the watchpoint before this thread.
> > + */
> > + kcsan_counter_inc(kcsan_counter_report_races);
> > + report_type = kcsan_report_race_check_race;
> > + } else {
> > + report_type = kcsan_report_race_check;
> > + }
> > +
> > + /* Encountered a data-race. */
> > + kcsan_counter_inc(kcsan_counter_data_races);
> > + kcsan_report(ptr, size, is_write, raw_smp_processor_id(), report_type);
> > +
> > + user_access_restore(flags);
> > + return false;
> > +}
> > +EXPORT_SYMBOL(__kcsan_check_watchpoint);
> > +
> > +void __kcsan_setup_watchpoint(const volatile void *ptr, size_t size,
> > + bool is_write)
> > +{
> > + atomic_long_t *watchpoint;
> > + union {
> > + u8 _1;
> > + u16 _2;
> > + u32 _4;
> > + u64 _8;
> > + } expect_value;
> > + bool is_expected = true;
> > + unsigned long ua_flags = user_access_save();
> > + unsigned long irq_flags;
> > +
> > + if (!should_watch(ptr))
> > + goto out;
> > +
> > + if (!check_encodable((unsigned long)ptr, size)) {
> > + kcsan_counter_inc(kcsan_counter_unencodable_accesses);
> > + goto out;
> > + }
> > +
> > + /*
> > + * Disable interrupts & preemptions, to ignore races due to accesses in
> > + * threads running on the same CPU.
> > + */
> > + local_irq_save(irq_flags);
> > + preempt_disable();
>
> Is there a point to that preempt_disable() here?
We want to avoid being preempted while the watchpoint is set up;
otherwise, we would report data-races for CPU-local data, which is
incorrect. An alternative would be adding the source CPU to the
watchpoint, and checking that the CPU != this_cpu. There are several
problems with that alternative:
1. We do not want to steal more bits from the watchpoint encoding for
things other than read/write, size, and address, as not only does it
affect accuracy, it would also increase performance overhead in the
fast-path.
2. As a consequence, if we get a preemption and run a task on the same
CPU, and there *is* a genuine data-race, we would *not* report it; and
since this is the common case (and not accesses to CPU-local data), it
makes more sense (from a data-race detection PoV) to simply disable
preemptions and ensure that all tasks are run on other CPUs as well as
avoid the problem of point (1).
I can add a comment to that effect here for v2.
Thanks,
-- Marco