Re: [RFC PATCH -RT] epoll: Fix eventpoll read-lock not writer-fair in PREEMPT_RT

From: John Ogness
Date: Thu Aug 26 2021 - 16:30:11 EST


On 2021-08-26, Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote:
> On 2021-08-25 15:27:54 [+0200], Frederic Weisbecker wrote:
>> Hi,
>>
>> Ok the patch is gross but at least this lets me start a discussion
>> about the issue.
>>
>> ---
>> From d9d66d650b3dac8947a34464dd2e0b546a8c6b63 Mon Sep 17 00:00:00 2001
>> From: Frederic Weisbecker <frederic@xxxxxxxxxx>
>> Date: Wed, 25 Aug 2021 14:24:54 +0200
>> Subject: [RFC PATCH -RT] epoll: Fix eventpoll read-lock not writer-fair in PREEMPT_RT
>>
>> The eventpoll lock has been converted to an rwlock some time ago with:
>>
>> a218cc491420 (epoll: use rwlock in order to reduce
>> ep_poll_callback() contention)
>>
>> Unfortunately this can result in scenarios where a high priority caller
>> of epoll_wait() needs to wait for the completion of lower priority wakers.
>>
>> The typical scenario is:
>>
>> 1) epoll_wait() waits and sleeps for new events in the ep_poll() loop.
>>
>> 2) new events arrive in ep_poll_callback(), the waiter is awakened while
>> ep->lock is read-acquired.
>>
>> 3) The high priority waiter preempts the waker but it can't acquire the
>> write lock in epoll_wait() so it blocks waiting for the low prio waker
>> without priority inheritance.
>>
>> I guess making readlock writer fair is still not the plan so all I can
>> propose is to make that rwlock build-conditional.
>
> It is writer fair in the sense that once a writer attempts to acquire
> the lock no new readers are allowed in.
>
> What you want is that the writer pi-boosts each reader which is what
> is not done (multi reader boost). Long ago there was an attempt to
> make this happen (I think with rwsem) but it turned out to be
> problematic. There was a workaround by only allowing one reader and
> doing PI as usual.

This patch is essentially forcing that exact workaround for eventpoll.

John Ogness