Re: Kernel Concurrency Sanitizer (KCSAN)
From: Joel Fernandes
Date: Fri Oct 04 2019 - 12:57:41 EST
On Fri, Oct 04, 2019 at 06:52:49PM +0200, Dmitry Vyukov wrote:
> On Fri, Oct 4, 2019 at 6:49 PM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> >
> > On Wed, Oct 02, 2019 at 09:51:58PM +0200, Marco Elver wrote:
> > > Hi Joel,
> > >
> > > On Tue, 1 Oct 2019 at 23:19, Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Fri, Sep 20, 2019 at 04:18:57PM +0200, Marco Elver wrote:
> > > > > Hi all,
> > > > >
> > > > > We would like to share a new data-race detector for the Linux kernel:
> > > > > Kernel Concurrency Sanitizer (KCSAN) --
> > > > > https://github.com/google/ktsan/wiki/KCSAN (Details:
> > > > > https://github.com/google/ktsan/blob/kcsan/Documentation/dev-tools/kcsan.rst)
> > > > >
> > > > > To those of you who we mentioned at LPC that we're working on a
> > > > > watchpoint-based KTSAN inspired by DataCollider [1], this is it (we
> > > > > renamed it to KCSAN to avoid confusion with KTSAN).
> > > > > [1] http://usenix.org/legacy/events/osdi10/tech/full_papers/Erickson.pdf
> > > > >
> > > > > In the coming weeks we're planning to:
> > > > > * Set up a syzkaller instance.
> > > > > * Share the dashboard so that you can see the races that are found.
> > > > > * Attempt to send fixes for some races upstream (if you find that the
> > > > > kcsan-with-fixes branch contains an important fix, please feel free to
> > > > > point it out and we'll prioritize that).
> > > > >
> > > > > There are a few open questions:
> > > > > * The big one: most of the reported races are due to unmarked
> > > > > accesses; prioritization or pruning of races to focus initial efforts
> > > > > to fix races might be required. Comments on how best to proceed are
> > > > > welcome. We're aware that these are issues that have recently received
> > > > > attention in the context of the LKMM
> > > > > (https://lwn.net/Articles/793253/).
> > > > > * How/when to upstream KCSAN?
> > > >
> > > > Looks exciting. I think based on our discussion at LPC, you mentioned
> > > > one way of pruning is if the compiler generated different code with _ONCE
> > > > annotations than what would have otherwise been generated. Is that still on
> > > > the table, for the purposing of pruning the reports?
> > >
> > > This might be interesting at first, but it's not entirely clear how
> > > feasible it is. It's also dangerous, because the real issue would be
> > > ignored. It may be that one compiler version on a particular
> > > architecture generates the same code, but any change in compiler or
> > > architecture and this would no longer be true. Let me know if you have
> > > any more ideas.
> >
> > My thought was this technique of looking at compiler generated code can be
> > used for prioritization of the reports. Have you tested it though? I think
> > without testing such technique, we could not know how much of benefit (or
> > lack thereof) there is to the issue.
> >
> > In fact, IIRC, the compiler generating different code with _ONCE annotation
> > can be given as justification for patches doing such conversions.
>
>
> We also should not forget about "missed mutex" races (e.g. unprotected
> radix tree), which are much worse and higher priority than a missed
> atomic annotation. If we look at codegen we may discard most of them
> as non important.
Sure. I was not asking to look at codegen as the only signal. But to use the
signal for whatever it is worth.
thanks,
- Joel