Re: Kernel Concurrency Sanitizer (KCSAN)
From: Mark Rutland
Date: Thu Oct 03 2019 - 13:29:25 EST
On Fri, Sep 20, 2019 at 07:51:04PM +0200, Marco Elver wrote:
> On Fri, 20 Sep 2019 at 18:47, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> >
> > On Fri, Sep 20, 2019 at 6:31 PM Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > >
> > > On Fri, Sep 20, 2019 at 04:18:57PM +0200, Marco Elver wrote:
> > > > We would like to share a new data-race detector for the Linux kernel:
> > > > Kernel Concurrency Sanitizer (KCSAN) --
> > > > https://github.com/google/ktsan/wiki/KCSAN (Details:
> > > > https://github.com/google/ktsan/blob/kcsan/Documentation/dev-tools/kcsan.rst)
> > >
> > > Nice!
> > >
> > > BTW kcsan_atomic_next() is missing a stub definition in <linux/kcsan.h>
> > > when !CONFIG_KCSAN:
> > >
> > > https://github.com/google/ktsan/commit/a22a093a0f0d0b582c82cdbac4f133a3f61d207c#diff-19d7c475b4b92aab8ba440415ab786ec
> > >
> > > ... and I think the kcsan_{begin,end}_atomic() stubs need to be static
> > > inline too.
>
> Thanks for catching, fixed and pushed. Feel free to rebase your arm64 branch.
Great; I've just done so!
What's the plan for posting a PATCH or RFC series?
The rest of this email is rabbit-holing on the issue KCSAN spotted;
sorry about that!
[...]
> > > We have some interesting splats at boot time in stop_machine, which
> > > don't seem to have been hit/fixed on x86 yet in the kcsan-with-fixes
> > > branch, e.g.
> > >
> > > [ 0.237939] ==================================================================
> > > [ 0.239431] BUG: KCSAN: data-race in multi_cpu_stop+0xa8/0x198 and set_state+0x80/0xb0
> > > [ 0.241189]
> > > [ 0.241606] write to 0xffff00001003bd00 of 4 bytes by task 24 on cpu 3:
> > > [ 0.243435] set_state+0x80/0xb0
> > > [ 0.244328] multi_cpu_stop+0x16c/0x198
> > > [ 0.245406] cpu_stopper_thread+0x170/0x298
> > > [ 0.246565] smpboot_thread_fn+0x40c/0x560
> > > [ 0.247696] kthread+0x1a8/0x1b0
> > > [ 0.248586] ret_from_fork+0x10/0x18
> > > [ 0.249589]
> > > [ 0.250006] read to 0xffff00001003bd00 of 4 bytes by task 14 on cpu 1:
> > > [ 0.251804] multi_cpu_stop+0xa8/0x198
> > > [ 0.252851] cpu_stopper_thread+0x170/0x298
> > > [ 0.254008] smpboot_thread_fn+0x40c/0x560
> > > [ 0.255135] kthread+0x1a8/0x1b0
> > > [ 0.256027] ret_from_fork+0x10/0x18
> > > [ 0.257036]
> > > [ 0.257449] Reported by Kernel Concurrency Sanitizer on:
> > > [ 0.258918] CPU: 1 PID: 14 Comm: migration/1 Not tainted 5.3.0-00007-g67ab35a199f4-dirty #3
> > > [ 0.261241] Hardware name: linux,dummy-virt (DT)
> > > [ 0.262517] ==================================================================>
>
> Thanks, the fixes in -with-fixes were ones I only encountered with
> Syzkaller, where I disable KCSAN during boot. I've just added a fix
> for this race and pushed to kcsan-with-fixes.
I think that's:
https://github.com/google/ktsan/commit/c1bc8ab013a66919d8347c2392f320feabb14f92
... but that doesn't look quite right to me, as it leaves us with the shape:
do {
if (READ_ONCE(msdata->state) != curstate) {
curstate = msdata->state;
switch (curstate) {
...
}
ack_state(msdata);
}
} while (curstate != MULTI_STOP_EXIT);
I don't believe that we have a guarantee of read-after-read ordering
between the READ_ONCE(msdata->state) and the subsequent plain access of
msdata->state, as we've been caught out on that in the past, e.g.
https://lore.kernel.org/lkml/1506527369-19535-1-git-send-email-will.deacon@xxxxxxx/
... which I think means we could switch on a stale value of
msdata->state. That would mean we might handle the same state twice,
calling ack_state() more times than expected and corrupting the count.
The compiler could also replace uses of curstate with a reload of
msdata->state. If it did so for the while condition, we could skip the
expected ack_state() for MULTI_STOP_EXIT, though it looks like that
might not matter.
I think we need to make sure that we use a consistent snapshot,
something like the below. Assuming I'm not barking up the wrong tree, I
can spin this as a proper patch.
Thanks,
Mark.
---->8----
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index b4f83f7bdf86..67a0b454b5b5 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -167,7 +167,7 @@ static void set_state(struct multi_stop_data *msdata,
/* Reset ack counter. */
atomic_set(&msdata->thread_ack, msdata->num_threads);
smp_wmb();
- msdata->state = newstate;
+ WRITE_ONCE(msdata->state, newstate);
}
/* Last one to ack a state moves to the next state. */
@@ -186,7 +186,7 @@ void __weak stop_machine_yield(const struct cpumask *cpumask)
static int multi_cpu_stop(void *data)
{
struct multi_stop_data *msdata = data;
- enum multi_stop_state curstate = MULTI_STOP_NONE;
+ enum multi_stop_state newstate, curstate = MULTI_STOP_NONE;
int cpu = smp_processor_id(), err = 0;
const struct cpumask *cpumask;
unsigned long flags;
@@ -210,8 +210,9 @@ static int multi_cpu_stop(void *data)
do {
/* Chill out and ensure we re-read multi_stop_state. */
stop_machine_yield(cpumask);
- if (msdata->state != curstate) {
- curstate = msdata->state;
+ newstate = READ_ONCE(msdata->state);
+ if (newstate != curstate) {
+ curstate = newstate;
switch (curstate) {
case MULTI_STOP_DISABLE_IRQ:
local_irq_disable();