Re: KCSAN: data-race in __alloc_file / __alloc_file

From: Marco Elver
Date: Tue Nov 12 2019 - 17:05:43 EST


On Tue, 12 Nov 2019 at 22:13, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, Nov 12, 2019 at 12:58 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > Honestly, my preferred model would have been to just add a comment,
> > and have the reporting tool know to then just ignore it. So something
> > like
> >
> > + // Benign data-race on min_flt
> > tsk->min_flt++;
> > perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
> >
> > for the case that Eric mentioned - the tool would trigger on
> > "data-race", and the rest of the comment could/should be for humans.
> > Without making the code uglier, but giving the potential for a nice
> > legible explanation instead of a completely illegible "let's
> > randomly use WRITE_ONCE() here" or something like that.
>
> Hmm. Looking at the practicality of this, it actually doesn't look
> *too* horrible.
>
> I note that at least clang already has a "--blacklist" ability. I
> didn't find a list of complete syntax for that, and it looks like it
> might be just "whole functions" or "whole source files", but maybe the
> clang people would be willing to add "file and line ranges" to the
> blacklists?
>
> Then you could generate the blacklist with that trivial grep before
> you start the build, and -fsanitize=thread would automatically simply
> not look at those lines.
>
> For a simple first case, maybe the rule could be that the comment has
> to be on the line. A bit less legible for humans, but it could be
>
> - tsk->min_flt++;
> + // Benign race min_flt - statistics only
> + tsk->min_flt++; // data-race
>
> instead.
>
> Wouldn't that be a much better annotation than having to add code?

Thanks for the suggestion.

Right now I can't say what the most reliable way to do this for KCSAN
is. Doing this through the compiler doesn't seem possible today, but
is something to look into. An alternative is to preprocess the code
based on comments somehow.

How many variations of such comments could exist?

If it's only one or two, as a counter suggestion, would a macro not be
more reliable? A macro would provide a uniform way to document intent,
but could otherwise be a no-op. The tool would have no problems
understanding the macro. For example "APPROX(tsk->min_flt++)" or
something else that documents that the computation can be approximate,
e.g. in the presence of races.

Thanks,
-- Marco