On Sun, Jan 22, 2023 at 04:24:21PM +0100, Hernan Ponce de Leon wrote:
On 1/20/2023 4:54 PM, Paul E. McKenney wrote:
On Fri, Jan 20, 2023 at 06:58:20AM -0800, Arjan van de Ven wrote:
On 1/20/2023 5:55 AM, Hernan Ponce de Leon wrote:
From: Hernan Ponce de Leon <hernanl.leon@xxxxxxxxxx>
kernel/locking/rtmutex.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 010cf4e6d0b8..7ed9472edd48 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -235,7 +235,7 @@ static __always_inline void mark_rt_mutex_waiters(struct rt_mutex_base *lock)
unsigned long owner, *p = (unsigned long *) &lock->owner;
do {
- owner = *p;
+ owner = READ_ONCE(*p);
} while (cmpxchg_relaxed(p, owner,
I don't see how this makes any difference at all.
*p can be read a dozen times and it's fine; cmpxchg has barrier semantics for compilers afaics
Doing so does suppress a KCSAN warning. You could also use data_race()
if it turns out that the volatile semantics would prevent a valuable
compiler optimization.
I think the import question is "is this a harmful data race (and needs to be
fixed as proposed by the patch) or a harmless one (and we should use
data_race() to silence tools)?".
In https://lkml.org/lkml/2023/1/22/160 I describe how this data race can
affect important ordering guarantees for the rest of the code. For this
reason I consider it a harmful one. If this is not the case, I would
appreciate some feedback or pointer to resources about what races care to
avoid spamming the mailing list in the future.
In the case, the value read is passed into cmpxchg_relaxed(), which
checks the value against memory. In this case, as Arjan noted, the only
compiler-and-silicon difference between data_race() and READ_ONCE()
is that use of data_race() might allow the compiler to do things like
tear the load, thus forcing the occasional spurious cmpxchg_relaxed()
failure. In contrast, LKMM (by design) throws up its hands when it sees
a data race. Something about not being eager to track the idiosyncrasies
of many compiler versions.
My approach in my own code is to use *_ONCE() unless it causes a visible
performance regression or if it confuses KCSAN. An example of the latter
can be debug code, in which case use of data_race() avoids suppressing
KCSAN warnings (and also false positives, depending).
Except that your other email seems to also be arguing that additional
ordering is required. So is https://lkml.org/lkml/2023/1/20/702 really
sufficient just by itself, or is additional ordering required?