Re: [PATCH] kcsan: Treat runtime as NMI-like with interrupt tracing

From: Marco Elver
Date: Mon Aug 17 2020 - 03:10:53 EST


On Tue, 11 Aug 2020 at 08:56, Marco Elver <elver@xxxxxxxxxx> wrote:
> On Mon, 10 Aug 2020 at 22:18, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > Marco Elver <elver@xxxxxxxxxx> writes:
> > > Since KCSAN instrumentation is everywhere, we need to treat the hooks
> > > NMI-like for interrupt tracing. In order to present a context that is
> > > as 'normal' as possible to the code called by KCSAN when reporting
> > > errors, we need to update the IRQ-tracing state.
> > >
> > > Tested: Several runs through kcsan-test with different configuration
> > > (PROVE_LOCKING on/off), as well as hours of syzbot testing with the
> > > original config that caught the problem (without CONFIG_PARAVIRT=y,
> > > which appears to cause IRQ state tracking inconsistencies even when
> > > KCSAN remains off, see Link).
> > >
> > > Link: https://lkml.kernel.org/r/0000000000007d3b2d05ac1c303e@xxxxxxxxxx
> > > Fixes: 248591f5d257 ("kcsan: Make KCSAN compatible with new IRQ state tracking")
> > > Reported-by: syzbot+8db9e1ecde74e590a657@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > Co-developed-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > > Signed-off-by: Marco Elver <elver@xxxxxxxxxx>
> > > ---
> > > Patch Note: This patch applies to latest mainline. While current
> > > mainline suffers from the above problem, the configs required to hit the
> > > issue are likely not enabled too often (of course with PROVE_LOCKING on;
> > > we hit it on syzbot though). It'll probably be wise to queue this as
> > > normal on -rcu, just in case something is still off, given the
> > > non-trivial nature of the issue. (If it should instead go to mainline
> > > right now as a fix, I'd like some more test time on syzbot.)
> >
> > I'd rather stick it into mainline before -rc1.
> >
> > Reviewed-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>
> Thank you, sounds good.
>
> FWIW I let it run on syzkaller overnight once more, rebased against
> Sunday's mainline, and found no DEBUG_LOCKDEP issues. (It still found
> the known issue in irqentry_exit(), but that is not specific to KCSAN:
> https://lore.kernel.org/lkml/000000000000e3068105ac405407@xxxxxxxxxx/)

I lost track of what's happening with the IRQ state tracking patches.
Do we still need this?

Or would Peter's new approach (to make raw->non-raw work) supersede this patch?
https://lkml.kernel.org/r/20200811201755.GI35926@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

That would appear to be the nicer solution.

Thanks,
-- Marco