Re: arm64: allmodconfig: BUG: KCSAN: data-race in p9_client_cb / p9_client_rpc
From: Marco Elver
Date: Sat Dec 03 2022 - 11:50:11 EST
On Sat, 3 Dec 2022 at 16:37, Dominique Martinet <asmadeus@xxxxxxxxxxxxx> wrote:
>
> (reply out of order)
>
> Naresh Kamboju wrote on Thu, Dec 01, 2022 at 01:13:25PM +0530:
> > > (You might need to build with at least CONFIG_DEBUG_INFO_REDUCED (or not
> > > reduced), but that is on by default for aarch64)
> >
> > Thanks for the suggestions.
> > The Kconfig is enabled now.
> > CONFIG_DEBUG_INFO_REDUCED=y
>
> It looks enabled in the config file you linked to, so I don't
> understand this remark?
> Did you produce the trace the other day without it and rebuild the
> kernel with it?
> In this case you also have CONFIG_DEBUG_INFO_SPLIT set, so the vmlinux
> file does not contain enough information to retrieve line numbers or
> types, and in particular addr2line cannot be used on the files you
> provided.
> I've never used split debug info before, but digging through old threads I'm
> not too hopeful unless that changed:
> https://lkml.iu.edu/hypermail/linux/kernel/1711.1/03393.html
> https://sourceware.org/bugzilla/show_bug.cgi?id=22434
>
> (...a test build later, it's still mostly useless...
> normal build
> $ ./scripts/faddr2line vmlinux __schedule+0x314
> __schedule+0x314/0x6c0:
> perf_fetch_caller_regs at include/linux/perf_event.h:1286
> (inlined by) __perf_sw_event_sched at include/linux/perf_event.h:1307
> (inlined by) perf_event_task_sched_out at include/linux/perf_event.h:1347
> (inlined by) prepare_task_switch at kernel/sched/core.c:5053
> (inlined by) context_switch at kernel/sched/core.c:5195
> (inlined by) __schedule at kernel/sched/core.c:6561
>
> split dwarf build
> $ ./scripts/faddr2line vmlinux __schedule+0x314
> aarch64-linux-gnu-addr2line: DWARF error: could not find abbrev number 860923
> __schedule+0x314/0x780:
> aarch64-linux-gnu-addr2line: DWARF error: could not find abbrev number 860923
> __schedule at core.c:?
>
> I'd tend to agree build time/space savings aren't worth the developer
> time.
> )
>
> Anyway, address sanitizer used to have a kasan_symbolize.py script but
> it looks like it got removed as no longer maintained, and I'm not sure
> what's a good tool to just run these logs through nowadays, might want
> to ask other test projects folks what they use...
>
> > > If you still have the vmlinux binary from that build (or if you can
> > > rebuild with the same options), running this text through addr2line
> > > should not take you too long.
> >
> > Please find build artifacts in this link,
> > - config
> > - vmlinux
> > - System.map
> > https://people.linaro.org/~anders.roxell/next-20221130-allmodconfig-arm64-tuxmake-build/
>
> So from the disassembly...
>
> - p9_client_cb+0x84 is right before the wake_up and after the wmb(), so
> I assume we're on writing req->status line 441:
>
> ---
> p9_client_cb(...)
> {
> ...
> smp_wmb();
> req->status = status;
>
> wake_up(&req->wq);
> ---
>
> report is about a write from 2 to 3, this makes sense we're going from
> REQ_STATUS_SENT (2) to REQ_STATUS_RCVD (3).
>
>
> - p9_client_rpc+0x1d0 isn't as simple to pin down as I'm having a hard
> time making sense of the kcsan instrumentations...
> The report is talking about a READ of 4 bytes at the same address, so
> I'd expect to see an access to req->status (and we're likely spot on
> wait_event_killable which checks req->status), but this doesn't seem to
> match up with the assembly: here's the excerpt from disass around 0x1d0
> = 464 (why doesn't gdb provide hex offsets..)
> ---
> 0xffff80000a46e9b8 <+440>: cmn w28, #0x200
> 0xffff80000a46e9bc <+444>: ccmn w28, #0xe, #0x4, ne // ne = any
> 0xffff80000a46e9c0 <+448>: b.eq 0xffff80000a46ecfc <p9_client_rpc+1276> // b.none
> 0xffff80000a46e9c4 <+452>: mov x0, x25
> 0xffff80000a46e9c8 <+456>: bl 0xffff800008543640 <__tsan_write4>
> 0xffff80000a46e9cc <+460>: mov w0, #0x2 // #2
> 0xffff80000a46e9d0 <+464>: str w0, [x21, #88]
> 0xffff80000a46e9d4 <+468>: b 0xffff80000a46ecfc <p9_client_rpc+1276>
> 0xffff80000a46e9d8 <+472>: mov w27, #0x1 // #1
> 0xffff80000a46e9dc <+476>: mov x0, x23
> 0xffff80000a46e9e0 <+480>: mov w1, #0x2bc // #700
> 0xffff80000a46e9e4 <+484>: bl 0xffff800008192d80 <__might_sleep>
> ---
>
> +464 is a write to x21 (client 'c', from looking at how it is passed
> into x0 for other function calls) at offset 88 (status field according
> to dwarf infos from a rebuild with your config/same sources)
>
> So, err, I'm a bit lost on this side.
> But I can't really find a problem with what KCSAN complains about --
> we are indeed accessing status from two threads without any locks.
> Instead of a lock, we're using a barrier so that:
> - recv thread/cb: writes to req stuff || write to req status
> - p9_client_rpc: reads req status || reads other fields from req
>
> Which has been working well enough (at least, without the barrier things
> blow up quite fast).
>
> So I'll just consider this a false positive, but if someone knows
> how much one can read into this, that'd be appreciated.
The barriers only ensure ordering, not atomicity of the accesses
themselves (for one, the compiler is well within its rights to transform
plain accesses in ways that the concurrent algorithm wasn't designed
for). In this case it looks like it's just missing
READ_ONCE()/WRITE_ONCE().
A (relatively) quick primer on the kernel's memory model and
where/what/how we need to "mark" accesses:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/access-marking.txt
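As a rough illustration of what "marking" the status accesses could look
like, here is a minimal userspace sketch. It is not the net/9p code: the
struct and the demo_cb()/demo_poll() helpers are hypothetical, and the
READ_ONCE()/WRITE_ONCE()/smp_wmb()/smp_rmb() definitions are simplified
stand-ins built on GCC/Clang extensions, not the kernel's real macros.
Only the status values SENT (2) and RCVD (3) come from the report above.

```c
#include <assert.h>

/* Simplified stand-ins (assumption): the kernel's real macros do more. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))
#define smp_wmb()        __atomic_thread_fence(__ATOMIC_RELEASE)
#define smp_rmb()        __atomic_thread_fence(__ATOMIC_ACQUIRE)

enum { REQ_STATUS_SENT = 2, REQ_STATUS_RCVD = 3 };

/* Hypothetical request: one data field published via the status field. */
struct demo_req {
	int payload;
	int status;
};

/* Writer side: fill in the data, order it, then publish the status. */
static void demo_cb(struct demo_req *req, int payload, int status)
{
	req->payload = payload;
	smp_wmb();                       /* data before status */
	WRITE_ONCE(req->status, status); /* marked, so no tearing or fusing */
}

/* Reader side: marked read of status, then ordered read of the data.
 * Returns 1 and stores the payload if the request has completed. */
static int demo_poll(struct demo_req *req, int *payload)
{
	if (READ_ONCE(req->status) < REQ_STATUS_RCVD)
		return 0;
	smp_rmb();                       /* status before data */
	*payload = req->payload;
	return 1;
}
```

The barriers still provide the ordering. The marked accesses additionally
tell the compiler (and KCSAN) that the concurrent access to status is
intentional.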
Thanks,
-- Marco