Re: [PATCH v3 7/7] locking/rtmutex: Acquire the hb lock via trylock after wait-proxylock.
From: Jiri Slaby
Date: Mon Jan 15 2024 - 07:54:45 EST
On 15. 01. 24, 12:52, Jiri Slaby wrote:
On 15. 01. 24, 12:40, Jiri Slaby wrote:
On 15. 09. 23, 17:19, Peter Zijlstra wrote:
On Fri, Sep 15, 2023 at 02:58:35PM +0200, Thomas Gleixner wrote:
I spent quite some time to convince myself that this is correct. I was
not able to poke a hole into it. So that really should be safe to
do. Famous last words ...
IKR :-/
Something like so then...
---
Subject: futex/pi: Fix recursive rt_mutex waiter state
So this breaks some random test in APR:
From
https://build.opensuse.org/package/live_build_log/openSUSE:Factory:Staging:G/apr/standard/x86_64:
testprocmutex : Line 122: child did not terminate with success
The child in fact terminates on
https://github.com/apache/apr/blob/trunk/test/testprocmutex.c#L93:
    while ((rv = apr_proc_mutex_timedlock(proc_lock, 1))) {
        if (!APR_STATUS_IS_TIMEUP(rv))
            exit(1); <----- here
The test creates 6 children, each of which calls
pthread_mutex_timedlock()/pthread_mutex_unlock() repeatedly (200 times)
in parallel, sleeping 1 us while holding the lock. The timeout is the
1 us passed above. The test expects all of them to fail (to time out),
but the timeout does not always happen in 6.7 (it's racy, so the
failure is semi-random: roughly 1 in 1000 attempts is bad).
That was not precise, I misread the test. What it actually checks is
that each lock attempt either succeeds or times out.
But since the commit, futex() yields 22/EINVAL, i.e. fails.
A simplified reproducer is attached (in particular, no APR anymore).
Build with -pthread, obviously. If you see
BADx rv=22
that's bad.
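To spell out what the reproducer treats as "bad": below is a minimal sketch of how the return value of pthread_mutex_timedlock() is sorted into the two acceptable outcomes (lock taken, timed out) and everything else. The enum and the classify_timedlock_rv() helper are hypothetical names introduced only for illustration; they are not part of APR or of the attached reproducer.

```c
#include <errno.h>

/* Hypothetical names, for illustration only. */
enum lock_outcome {
	LOCK_ACQUIRED,		/* rv == 0: we own the mutex */
	LOCK_TIMED_OUT,		/* rv == ETIMEDOUT: acceptable, caller retries */
	LOCK_UNEXPECTED,	/* anything else, e.g. EINVAL (22 on Linux) */
};

/* Classify the return value of pthread_mutex_timedlock(). */
static enum lock_outcome classify_timedlock_rv(int rv)
{
	if (rv == 0)
		return LOCK_ACQUIRED;
	if (rv == ETIMEDOUT)
		return LOCK_TIMED_OUT;
	return LOCK_UNEXPECTED;
}
```

With this classification, the "BADx rv=22" line corresponds to LOCK_UNEXPECTED with rv == EINVAL.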
regards,
--
js
suse labs
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#include <sys/mman.h>
#include <sys/time.h>
#include <sys/wait.h>

#define MAX_WAIT_USEC (1000*1000)
#define CHILDREN 16
#define MAX_ITER 200
#define NS_PER_S 1000000000

/* Lives in shared anonymous memory so that all forked children see it. */
static pthread_mutex_t *proc_lock;

static void child(void)
{
	int rv, i = 0;

	do {
		int wait_usec = 0;
		struct timespec abstime;

		/* Absolute deadline 1 us from now, computed once per iteration. */
		clock_gettime(CLOCK_REALTIME, &abstime);
		abstime.tv_nsec += 1000;
		if (abstime.tv_nsec >= NS_PER_S) {
			abstime.tv_sec++;
			abstime.tv_nsec -= NS_PER_S;
		}

		/* Retry on timeout; any other error is the bug being reproduced. */
		while ((rv = pthread_mutex_timedlock(proc_lock, &abstime))) {
			if (rv != ETIMEDOUT) {
				fprintf(stderr, "BADx rv=%d\n", rv);
				abort();
			}
			if (++wait_usec >= MAX_WAIT_USEC)
				abort();
		}
		//fprintf(stderr, "[%d] rv=%d\n", getpid(), rv);
		i++;

		/* Hold the lock briefly so that the other children contend on it. */
		usleep(1);
		if (pthread_mutex_unlock(proc_lock))
			abort();
	} while (i < MAX_ITER);

	exit(0);
}

int main(void)
{
	pthread_mutexattr_t mattr;

	proc_lock = mmap(NULL, sizeof(*proc_lock),
			 PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_SHARED,
			 -1, 0);
	if (proc_lock == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Process-shared, robust, priority-inheriting (PI futex) mutex. */
	pthread_mutexattr_init(&mattr);
	pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
	pthread_mutexattr_setrobust(&mattr, PTHREAD_MUTEX_ROBUST);
	pthread_mutexattr_setprotocol(&mattr, PTHREAD_PRIO_INHERIT);
	pthread_mutex_init(proc_lock, &mattr);
	pthread_mutexattr_destroy(&mattr);

	for (unsigned a = 0; a < CHILDREN; a++)
		if (!fork())
			child();

	for (unsigned a = 0; a < CHILDREN; a++)
		wait(NULL);

	pthread_mutex_destroy(proc_lock);

	return 0;
}
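
For reference, the deadline arithmetic in child() can be pulled out into a small helper. This is only a sketch: timespec_add_ns() is a hypothetical name, not something from APR or glibc, and it assumes a non-negative offset and an already normalized input (0 <= tv_nsec < 1e9).

```c
#include <time.h>

#define NS_PER_S 1000000000L

/*
 * Hypothetical helper: return ts advanced by ns nanoseconds, keeping
 * tv_nsec normalized to [0, NS_PER_S). Assumes ns >= 0 and that ts is
 * already normalized.
 */
static struct timespec timespec_add_ns(struct timespec ts, long ns)
{
	ts.tv_sec += ns / NS_PER_S;
	ts.tv_nsec += ns % NS_PER_S;
	if (ts.tv_nsec >= NS_PER_S) {
		ts.tv_sec++;
		ts.tv_nsec -= NS_PER_S;
	}
	return ts;
}
```

The reproducer's 1 us deadline would then be timespec_add_ns(now, 1000), where now comes from clock_gettime(CLOCK_REALTIME, ...).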