Re: [RT BUG] Stall caused by eventpoll, rwlocks and CFS bandwidth controller
From: Jan Kiszka
Date: Tue Apr 15 2025 - 02:54:19 EST
On 15.04.25 08:23, Sebastian Andrzej Siewior wrote:
> On 2025-04-15 07:35:50 [+0200], Jan Kiszka wrote:
>>> On RT the read_lock() in the timer blocks, and the write blocks, too. So
>>> every blocker on the lock is scheduled out until the reader is gone. On
>>> top of that, the reader gets RCU boosted with FIFO-1 by default to get
>>> out.
>>
>> There is no boosting of the active readers on RT as there is no
>> information recorded about who is currently holding a read lock. This is
>> the whole reason why rwlocks are hairy with RT, I thought.
>
> Kind of, yes. PREEMPT_RT enables RCU boosting by default, at
> SCHED_FIFO priority 1. If you acquire a read lock you start an RCU
> section. If you get stuck in an RCU section for too long then this
> boosting will take
> effect by making the task, within the RCU section, the owner of the
> boost-lock and the boosting task will try to acquire it. This is used to
> get SCHED_OTHER tasks out of the RCU section.
> But if a SCHED_FIFO task is on the CPU then this boosting will have
> no effect because the scheduler will not switch to a task with lower
> priority.
Does that boosting happen to need ktimersd or ksoftirqd (both of which
are stalling in our case)? I'm still looking for the reason why the
boosting does not help in the observed stall scenarios.
Jan
--
Siemens AG, Foundational Technologies
Linux Expert Center