Re: Kernel Concurrency Sanitizer (KCSAN)

From: Eric Dumazet
Date: Wed Oct 09 2019 - 12:39:47 EST




On 10/9/19 12:45 AM, Dmitry Vyukov wrote:
> On Sat, Oct 5, 2019 at 6:16 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>>
>> On Sat, Oct 5, 2019 at 2:58 AM Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
>>>> This one is tricky. What I think we need to avoid is an onslaught of
>>>> patches adding READ_ONCE/WRITE_ONCE without a concrete analysis of the
>>>> code being modified. My worry is that Joe Developer is eager to get their
>>>> first patch into the kernel, so runs this tool and starts spamming
>>>> maintainers with these things to the point that they start ignoring KCSAN
>>>> reports altogether because of the time they take up.
>>>>
>>>> I suppose one thing we could do is to require each new READ_ONCE/WRITE_ONCE
>>>> to have a comment describing the racy access, a bit like we do for memory
>>>> barriers. Another possibility would be to use atomic_t more widely if
>>>> there is genuine concurrency involved.
>>>>
>>>
>>> About READ_ONCE() and WRITE_ONCE(), we will probably need
>>>
>>> ADD_ONCE(var, value) for arches that can implement the RMW in a single instruction.
>>>
>>> WRITE_ONCE(var, var + value) does not look pretty, and increases register pressure.
>>
>> FWIW modern compilers can handle this if we tell them what we are trying to do:
>>
>> void foo(int *p, int x)
>> {
>> x += __atomic_load_n(p, __ATOMIC_RELAXED);
>> __atomic_store_n(p, x, __ATOMIC_RELAXED);
>> }
>>
>> $ clang test.c -c -O2 && objdump -d test.o
>>
>> 0000000000000000 <foo>:
>> 0: 01 37 add %esi,(%rdi)
>> 2: c3 retq
>>
>> We can have syntactic sugar on top of this of course.
>
> An interesting precedent come up in another KCSAN bug report. Namely,
> it may be reasonable for a compiler to use different optimization
> heuristics for concurrent and non-concurrent code. Consider there are
> some legal code transformations, but it's unclear if they are
> profitable or not. It may be the case that for non-concurrent code the
> expectation is that it's a profitable transformation, but for
> concurrent code it is not. So that may be another reason to
> communicate to compiler what we want to do, rather than trying to
> trick and play against each other. I've added the concrete example
> here:
> https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE#it-may-improve-performance
>

Note that for bit fields, READ_ONCE() wont work.

Concrete example in net/xfrm/xfrm_algo.c:xfrm_probe_algs(void)
...
if (aalg_list[i].available != status)
aalg_list[i].available = status;
...
if (ealg_list[i].available != status)
ealg_list[i].available = status;
...
if (calg_list[i].available != status)
calg_list[i].available = status;