Re: [RFC PATCH -RT] epoll: Fix eventpoll read-lock not writer-fair in PREEMPT_RT

From: Sebastian Andrzej Siewior
Date: Thu Aug 26 2021 - 07:53:45 EST

On 2021-08-25 15:27:54 [+0200], Frederic Weisbecker wrote:
> Hi,
> Ok the patch is gross but at least this lets me start a discussion
> about the issue.
> ---
> From d9d66d650b3dac8947a34464dd2e0b546a8c6b63 Mon Sep 17 00:00:00 2001
> From: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Date: Wed, 25 Aug 2021 14:24:54 +0200
> Subject: [RFC PATCH -RT] epoll: Fix eventpoll read-lock not writer-fair in PREEMPT_RT
> The eventpoll lock has been converted to an rwlock some time ago with:
> a218cc491420 (epoll: use rwlock in order to reduce ep_poll_callback()
> contention)
> Unfortunately this can result in scenarios where a high priority caller
> of epoll_wait() needs to wait for the completion of lower priority wakers.
> The typical scenario is:
> 1) epoll_wait() waits and sleeps for new events in the ep_poll() loop.
> 2) new events arrive in ep_poll_callback(), the waiter is awakened while
> ep->lock is read-acquired.
> 3) The high priority waiter preempts the waker but it can't acquire the
> write lock in epoll_wait() so it blocks waiting for the low prio waker
> without priority inheritance.
> I guess making readlock writer fair is still not the plan so all I can
> propose is to make that rwlock build-conditional.

It is writer fair in the sense that once a writer attempts to acquire the
lock, no new readers are allowed in.
What you want is for the writer to PI-boost each reader (multi-reader
boost), which is not done. Long ago there was an attempt to make
this happen (I think with rwsem) but it turned out to be problematic.
There was a workaround by only allowing one reader and doing PI as
This was then dropped because multi-reader became a must-have for
other reasons, and in the meantime the lack of PI-boosting wasn't that
*problematic* anymore. The problematic user was converted to RCU in the
meantime, making the read side lockless while the writer takes a regular
lock.
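
To illustrate the distinction between "writer fair" and "PI-boosting
readers", here is a minimal userspace sketch using C11 atomics. It is
not the kernel's rwlock_t or its PREEMPT_RT implementation; the
toy_rwlock type and all toy_*() helpers are hypothetical names made up
for this example, and the try-lock style is only there to keep the
behaviour easy to observe. It shows that a pending writer keeps new
readers out, yet still has to wait for readers that are already inside:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock for illustration only, not kernel code. */
struct toy_rwlock {
	atomic_int  readers;        /* readers currently inside */
	atomic_bool writer_pending; /* a writer is waiting for or holding the lock */
};

static void toy_init(struct toy_rwlock *l)
{
	atomic_init(&l->readers, 0);
	atomic_init(&l->writer_pending, false);
}

/* Fails as soon as a writer has announced itself: no new readers get in. */
static bool toy_read_trylock(struct toy_rwlock *l)
{
	if (atomic_load(&l->writer_pending))
		return false;
	atomic_fetch_add(&l->readers, 1);
	/* Re-check: a writer may have announced itself in between. */
	if (atomic_load(&l->writer_pending)) {
		atomic_fetch_sub(&l->readers, 1);
		return false;
	}
	return true;
}

static void toy_read_unlock(struct toy_rwlock *l)
{
	int prev = atomic_fetch_sub(&l->readers, 1);

	assert(prev > 0); /* unlock without a matching lock is a bug */
	(void)prev;
}

/* Step 1 for a writer: block new readers. Only one writer may be pending. */
static bool toy_write_announce(struct toy_rwlock *l)
{
	return !atomic_exchange(&l->writer_pending, true);
}

/*
 * Step 2 for a writer: succeeds only once the existing readers have left.
 * Nothing here boosts those readers; the writer just has to wait for them.
 */
static bool toy_write_trylock(struct toy_rwlock *l)
{
	return atomic_load(&l->readers) == 0;
}

static void toy_write_unlock(struct toy_rwlock *l)
{
	atomic_store(&l->writer_pending, false);
}
```

If a low priority reader holds the toy lock and a high priority writer
calls toy_write_announce(), further toy_read_trylock() calls fail, but
toy_write_trylock() keeps failing until that reader gets CPU time and
calls toy_read_unlock(). That waiting period is where the missing
priority inheritance would matter.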

> Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>