Re: BUG: KCSAN: data-race in tick_nohz_next_event / tick_nohz_stop_tick

From: Thomas Gleixner
Date: Sat Dec 05 2020 - 18:48:13 EST


On Sat, Dec 05 2020 at 19:18, Thomas Gleixner wrote:
> On Fri, Dec 04 2020 at 20:53, Marco Elver wrote:
> It might be useful to find the actual variable, data member or whatever
> which is involved in the various reports and if there is a match then
> the reports could be aggregated. The 3 patterns here are not even the
> complete possible picture.
>
> So if you sum them up: 58 + 148 + 205 instances then their weight
> becomes more significant as well.

I just looked into the moderation queue and picked stuff which I'm
familiar with from the subject line.

There are quite a few reports which have different trigger scenarios,
but are all related to the same issue.

https://syzkaller.appspot.com/bug?id=f5a5ed5b2b6c3e92bc1a9dadc934c44ee3ba4ec5
https://syzkaller.appspot.com/bug?id=36fc4ad4cac8b8fc8a40713f38818488faa9e9f4

are just variations of the same problem: timer_base->running_timer being
set to NULL without holding the base lock. Safe, but insanely hard to
explain why :)

Next:

https://syzkaller.appspot.com/bug?id=e613fc2458de1c8a544738baf46286a99e8e7460
https://syzkaller.appspot.com/bug?id=55bc81ed3b2f620f64fa6209000f40ace4469bc0
https://syzkaller.appspot.com/bug?id=972894de81731fc8f62b8220e7cd5153d3e0d383
.....

Those are just the ones which caught my eye, and all of them are related
to task->flags usage. There are tons more, judging from the subject
lines.

So you really want to look at them as classes of problems and not as
individual scenarios.
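
A minimal sketch of what such an aggregation could look like, purely for
illustration. The report titles and the variable key below are made-up
placeholders, not real syzbot data; only the three instance counts come
from the quoted mail above, and which count belongs to which pattern is
arbitrary here.

```python
from collections import defaultdict

# Hypothetical report records: (title, variable involved, instance count).
# Titles and the variable key are placeholders. Only the counts 58, 148
# and 205 are taken from the quoted mail; their pairing is arbitrary.
REPORTS = [
    ("pattern 1", "var_x", 58),
    ("pattern 2", "var_x", 148),
    ("pattern 3", "var_x", 205),
]


def aggregate(reports):
    """Sum the instance counts of all reports that involve the same variable."""
    totals = defaultdict(int)
    for _title, variable, count in reports:
        totals[variable] += count
    return dict(totals)


if __name__ == "__main__":
    # Print one line per variable, i.e. per class of problem.
    for variable, total in sorted(aggregate(REPORTS).items()):
        print(f"{variable}: {total} instances")
```

Keyed on the variable instead of the subject line, the three patterns
collapse into one entry whose weight is the sum of the individual counts.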

Thanks,

tglx