Re: Kernel Concurrency Sanitizer (KCSAN)

From: Joel Fernandes
Date: Fri Oct 04 2019 - 12:49:03 EST


On Wed, Oct 02, 2019 at 09:51:58PM +0200, Marco Elver wrote:
> Hi Joel,
>
> On Tue, 1 Oct 2019 at 23:19, Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> >
> > On Fri, Sep 20, 2019 at 04:18:57PM +0200, Marco Elver wrote:
> > > Hi all,
> > >
> > > We would like to share a new data-race detector for the Linux kernel:
> > > Kernel Concurrency Sanitizer (KCSAN) --
> > > https://github.com/google/ktsan/wiki/KCSAN (Details:
> > > https://github.com/google/ktsan/blob/kcsan/Documentation/dev-tools/kcsan.rst)
> > >
> > > To those of you who we mentioned at LPC that we're working on a
> > > watchpoint-based KTSAN inspired by DataCollider [1], this is it (we
> > > renamed it to KCSAN to avoid confusion with KTSAN).
> > > [1] http://usenix.org/legacy/events/osdi10/tech/full_papers/Erickson.pdf
> > >
> > > In the coming weeks we're planning to:
> > > * Set up a syzkaller instance.
> > > * Share the dashboard so that you can see the races that are found.
> > > * Attempt to send fixes for some races upstream (if you find that the
> > > kcsan-with-fixes branch contains an important fix, please feel free to
> > > point it out and we'll prioritize that).
> > >
> > > There are a few open questions:
> > > * The big one: most of the reported races are due to unmarked
> > > accesses; prioritization or pruning of races to focus initial efforts
> > > to fix races might be required. Comments on how best to proceed are
> > > welcome. We're aware that these are issues that have recently received
> > > attention in the context of the LKMM
> > > (https://lwn.net/Articles/793253/).
> > > * How/when to upstream KCSAN?
> >
> > Looks exciting. I think based on our discussion at LPC, you mentioned
> > one way of pruning is if the compiler generated different code with _ONCE
> > annotations than what would have otherwise been generated. Is that still on
> > the table, for the purpose of pruning the reports?
>
> This might be interesting at first, but it's not entirely clear how
> feasible it is. It's also dangerous, because the real issue would be
> ignored. It may be that one compiler version on a particular
> architecture generates the same code, but any change in compiler or
> architecture and this would no longer be true. Let me know if you have
> any more ideas.

My thought was that this technique of looking at compiler-generated code
could be used to prioritize the reports rather than to prune them. Have you
tested it, though? Without testing such a technique, I think we cannot know
how much benefit (or lack thereof) it brings.

In fact, IIRC, the compiler generating different code with a _ONCE
annotation can be given as justification for patches doing such conversions.
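
To make the comparison concrete, here is a rough userspace sketch of the
kind of pair one could compile and diff. Everything in it is an assumption
for illustration: the READ_ONCE() below is a simplified stand-in built on a
volatile cast (it relies on the GCC/Clang __typeof__ extension and is not
the kernel's actual definition), and flag, sum_plain() and sum_once() are
hypothetical names.

```c
/* Simplified stand-in for the kernel's READ_ONCE(); an assumption for
 * illustration only, not the real kernel macro. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical shared variable that another thread might update. */
int flag;

/* Plain (unmarked) access: the compiler is allowed to load flag once,
 * hoist the load out of the loop, or fold the loop into a multiply. */
int sum_plain(int n)
{
	int i, sum = 0;

	for (i = 0; i < n; i++)
		sum += flag;
	return sum;
}

/* Marked access: the volatile load has to be performed on every
 * iteration, so the generated code may differ from sum_plain(). */
int sum_once(int n)
{
	int i, sum = 0;

	for (i = 0; i < n; i++)
		sum += READ_ONCE(flag);
	return sum;
}
```

Single-threaded, both functions return the same value; the interesting part
is the assembly. Building with something like "gcc -O2 -S" and comparing the
two functions is one way to check whether the annotation changes the output,
with the caveat Marco raised that the answer can change with the compiler
version or architecture.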

thanks,

- Joel