Re: [RFC PATCH v2 0/7] Defer throttle when task exits to user
From: K Prateek Nayak
Date: Tue Apr 15 2025 - 11:49:40 EST
Hello Jan,
Sorry for the noise.
On 4/15/2025 4:46 PM, K Prateek Nayak wrote:
Hello Jan,
On 4/15/2025 3:51 PM, Jan Kiszka wrote:
Is this in line with what you are seeing?
Yes, and if you wait a bit longer for the second reporting round, you
should get more task backtraces as well.
So looking at the backtrace [1], Aaron's patch should help with the
stalls you are seeing.
timerfd, which queues an hrtimer, also uses ep_poll_callback() to wake up
the epoll waiter. That callback queues ahead of the bandwidth timer and
requires the read lock, but since the writer has already tried to grab
the lock, readers are pushed onto the slowpath. If epoll-stall-writer is
now throttled, it needs ktimers to replenish its bandwidth, which cannot
happen until ktimers grabs the read lock first.
# epoll-stall-writer
So I got confused between "epoll-stall" and "epoll-stall-writer" here.
Turns out the actual series of events (based on traces, and hopefully
correct this time) is slightly longer. The correct series of events
is:
# epoll-stall-writer
anon_pipe_write()
__wake_up_common()
ep_poll_callback() {
read_lock_irq(&ep->lock) /* Read lock acquired here */
__wake_up_common()
ep_autoremove_wake_function()
try_to_wake_up() /* Wakes up "epoll-stall" */
preempt_schedule()
...
# "epoll-stall-writer" has run out of bandwidth, needs replenish to run
# sched_switch: "epoll-stall-writer" => "epoll-stall"
... /* Resumes from epoll_wait() */
epoll_wait() => 1 /* Write to FIFO */
read() /* Reads one byte of data */
epoll_wait()
write_lock_irq() /* Tries to grab write lock; "epoll-stall-writer" still has read lock */
schedule_rtlock() /* Sleeps, but puts subsequent readers on the slowpath */
...
# sched_switch: "epoll-stall" => "swapper"
# CPU is idle
...
# Timer interrupt schedules ktimers
# sched_switch: "swapper" => "ktimers"
hrtimer_run_softirq()
timerfd_tmrproc()
__wake_up_common()
ep_poll_callback() {
read_lock_irq(&ep->lock) /* Blocks since we are in rwlock slowpath */
schedule_rtlock()
...
# sched_switch: "ktimers" => "swapper"
# Bandwidth replenish never happens
# Stall
From a second look at the trace, this should be the right series of
events, since "epoll-stall-writer" with bandwidth control seems
to have been cut off while doing the wakeup and hasn't run
again.
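
For illustration only, the series of events above can be replayed against a
toy model. This is a rough sketch in Python, not kernel code: ToyRwLock and
replenish_happens() are hypothetical names, the lock only mimics the
"a waiting writer pushes new readers onto the slowpath" behaviour described
in the trace, and the defer_throttle branch is an assumption about what
deferring the throttle to exit-to-user would change (the throttled task
drops the read lock before it stops running).

```python
class ToyRwLock:
    """Toy writer-biased rwlock: a waiting writer blocks all new readers."""

    def __init__(self):
        self.readers = set()
        self.writer = None
        self.writer_waiting = False

    def read_trylock(self, task):
        # New readers take the slowpath (fail here) once a writer holds
        # the lock or is waiting for it.
        if self.writer is not None or self.writer_waiting:
            return False
        self.readers.add(task)
        return True

    def read_unlock(self, task):
        self.readers.discard(task)

    def write_trylock(self, task):
        if self.readers or self.writer is not None:
            self.writer_waiting = True  # writer sleeps, readers now blocked
            return False
        self.writer = task
        self.writer_waiting = False
        return True

    def write_unlock(self):
        self.writer = None


def replenish_happens(defer_throttle):
    """Replay the trace; return True if ktimers gets to replenish bandwidth."""
    lock = ToyRwLock()
    writer, waiter, ktimers = "epoll-stall-writer", "epoll-stall", "ktimers"

    # ep_poll_callback(): the writer task takes the read lock, wakes up
    # "epoll-stall", then runs out of bandwidth.
    assert lock.read_trylock(writer)
    if defer_throttle:
        # Assumption: with the throttle deferred, the task finishes its
        # kernel section and drops the read lock before being throttled.
        lock.read_unlock(writer)

    # "epoll-stall" resumes and wants the write lock.
    if lock.write_trylock(waiter):
        lock.write_unlock()

    # ktimers runs the timerfd callback, queued ahead of the bandwidth timer.
    if not lock.read_trylock(ktimers):
        return False  # ktimers sleeps; the bandwidth timer never runs
    lock.read_unlock(ktimers)
    return True
```

In this model, replenish_happens(False) follows the stall in the trace,
while replenish_happens(True) lets ktimers through.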
Sorry for the noise.
[..snip..]
[1] https://lore.kernel.org/all/62304351-7fc0-48b6-883b-d346886dac8e@xxxxxxx/
Jan
--
Thanks and Regards,
Prateek