Re: [PATCH] locking/osq_lock: fix a data race in osq_wait_next

From: Qian Cai
Date: Tue Jan 28 2020 - 07:53:05 EST

> On Jan 28, 2020, at 6:46 AM, Marco Elver <elver@xxxxxxxxxx> wrote:
>
> Qian: firstly I suggest you try
> CONFIG_KCSAN_REPORT_ONCE_IN_MS=1000000000 as mentioned before so your
> system doesn't get spammed, considering you do not use the default
> config but want to use all debugging tools at once which seems to
> trigger certain data races more than usual.

Yes, I had that set. There are still many reports, which I plan to look at one by one. It takes so much time that it causes systemd storage lookup timeouts, and I had to get out of the emergency shell manually.

>
> Secondly, what are your expectations? If you expect the situation to
> be perfect tomorrow, you'll be disappointed. This is inherent, given
> the problem we face (safe concurrency). Consider the various parts to
> this story: concurrent kernel code, the LKMM, people's preferences and
> opinions, and KCSAN (which is late to the party). All of them are
> still evolving, hopefully together. At least that's my expectation.

I'll try to reduce the splats as much as possible, whether by adding data_race(), disabling KCSAN for the whole file, or actually fixing the code. Any unresolved splat hurts, to some degree, the ability to find the real data races.

>
> What to do about osq_lock here? If people agree that no further
> annotations are wanted, and the reasoning above concludes there are no
> bugs, we can blacklist the file. That would, however, miss new data
> races in future.

This is a question for the locking maintainers. The data_race() macro sounds reasonable to me, but blacklisting the file is still better than leaving it as-is.