[PATCHSET v3 0/5] Add support for epoll min_wait

From: Jens Axboe
Date: Sun Oct 30 2022 - 18:02:27 EST


Hi,

tldr - we saw a 6-7% CPU reduction with this patch. See patch 6 for
full numbers.

This adds support for EPOLL_CTL_MIN_WAIT, which allows setting a minimum
time that epoll_wait() should wait for events on a given epoll context.
Some justification and numbers are in patch 6; patches 1-5 are really
just prep patches or cleanups.

Sending this out to get some input on the API, basically. This is
obviously a per-context type of operation in this patchset, which isn't
necessarily ideal for every use case. Questions to be debated:

1) Would we want this to be available through epoll_wait() directly?
That would allow this to be done on a per-epoll_wait() basis, rather
than be tied to the specific context.

2) If the answer to #1 is yes, would we still want EPOLL_CTL_MIN_WAIT?

I think there are pros and cons to both, and perhaps the answer to both is
"yes". There are some benefits to doing this at epoll setup time, for
example - it nicely isolates it to that part rather than needing to be
done dynamically every time epoll_wait() is called. This also helps the
application code, as it can turn off any busyness tracking based on
whether the setup accepted EPOLL_CTL_MIN_WAIT or not.
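
As a rough illustration of that setup-time pattern, here is a minimal
sketch. It is not taken from the patches. The opcode value (4), passing
the minimum wait in microseconds via data.u64, and passing -1 as the fd
are all assumptions made for the example, as are the names loop_cfg,
try_set_min_wait() and loop_setup(). On a kernel without this series the
epoll_ctl() call simply fails and the application keeps its own busyness
tracking.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * ASSUMPTION: the opcode value and the argument encoding below are
 * illustrative only; check the uapi header of the patched kernel.
 */
#ifndef EPOLL_CTL_MIN_WAIT
#define EPOLL_CTL_MIN_WAIT 4
#endif

/* Hypothetical per-event-loop state kept by the application. */
struct loop_cfg {
	int epfd;		/* epoll context */
	int track_busyness;	/* 1 if the app must do its own tracking */
};

/* Returns 1 if the kernel accepted the minimum wait, 0 otherwise. */
static int try_set_min_wait(int epfd, uint64_t min_wait_usec)
{
	struct epoll_event ev;

	memset(&ev, 0, sizeof(ev));
	ev.data.u64 = min_wait_usec;	/* assumed: usec passed in data */

	/* assumed: no target fd is involved, so -1 is passed */
	return epoll_ctl(epfd, EPOLL_CTL_MIN_WAIT, -1, &ev) == 0;
}

/*
 * Create the epoll context and decide once, at setup time, whether the
 * application still needs its own busyness tracking.
 * Returns 0 on success, -1 if the epoll context could not be created.
 */
static int loop_setup(struct loop_cfg *cfg, uint64_t min_wait_usec)
{
	cfg->epfd = epoll_create1(0);
	if (cfg->epfd < 0)
		return -1;

	/* If the kernel handles min_wait, skip userspace tracking. */
	cfg->track_busyness = !try_set_min_wait(cfg->epfd, min_wait_usec);
	return 0;
}
```

With something like this, the rest of the event loop only has to look at
cfg->track_busyness; the epoll_wait() call sites stay unchanged.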

Anyway, tossing this out there as it yielded quite good results in some
initial testing; we're running more of it. Sending out a v3 now since
someone reported that nonblock issue which is annoying. Hoping to get some
more discussion this time around, or at least some...

Also available here:

https://git.kernel.dk/cgit/linux-block/log/?h=epoll-min_ts

Since v2:
- Fix an issue with nonblock event checking (timeout given, 0/0 set)
- Add another prep patch, getting rid of passing in a known 'false'
to ep_busy_loop()

--
Jens Axboe