[PATCHSET RFC v2 0/5] Add support for epoll min_wait

From: Jens Axboe
Date: Fri Oct 28 2022 - 17:43:39 EST


Hi,

tldr - we saw a 6-7% CPU reduction with this patch. See patch 5 for
full numbers.

This adds support for EPOLL_CTL_MIN_WAIT, which allows setting a minimum
time that epoll_wait() should wait for events on a given epoll context.
Some justification and numbers are in patch 5; patches 1-4 are really
just prep patches.

Sending this as an RFC to hash out the API, basically. This is
obviously a per-context type of operation in this patchset, which isn't
necessarily ideal for every use case. Questions to be debated:

1) Would we want this to be available through epoll_wait() directly?
That would allow this to be done on a per-epoll_wait() basis, rather
than be tied to the specific context.

2) If the answer to #1 is yes, would we still want EPOLL_CTL_MIN_WAIT?

I think there are pros and cons to both, and perhaps the answer to both
is "yes". There are some benefits to doing this at epoll setup time,
for example - it nicely isolates it to that part rather than needing
to be done dynamically every time epoll_wait() is called. This also
helps the application code, as it can turn off any busyness tracking
based on whether the setup accepted EPOLL_CTL_MIN_WAIT or not.
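
To make the setup-time argument concrete, here is a rough sketch of how
an application might probe for the feature once and then decide whether
to keep its own busyness tracking. This is only an illustration under
stated assumptions: the opcode value 4, passing the minimum wait in
microseconds through event.data.u64, and using -1 as the target fd are
all placeholders here, not a statement of the final ABI. The names
try_set_min_wait(), struct poller and poller_init() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* ASSUMPTION: placeholder opcode, used only if the headers lack one. */
#ifndef EPOLL_CTL_MIN_WAIT
#define EPOLL_CTL_MIN_WAIT 4
#endif

/*
 * Ask the kernel to apply a minimum wait to this epoll context.
 * ASSUMPTION: the value is passed in microseconds via data.u64 and the
 * target fd is ignored (-1). Returns true only if the kernel accepted
 * the request; a kernel without support rejects the unknown opcode.
 */
static bool try_set_min_wait(int epfd, uint64_t min_wait_usec)
{
	struct epoll_event ev;

	memset(&ev, 0, sizeof(ev));
	ev.data.u64 = min_wait_usec;
	return epoll_ctl(epfd, EPOLL_CTL_MIN_WAIT, -1, &ev) == 0;
}

/* Hypothetical application-side state for one event loop. */
struct poller {
	int epfd;
	bool track_busyness;	/* app-side fallback, off if kernel helps */
};

/* Create the epoll context and probe for min_wait support once. */
static int poller_init(struct poller *p, uint64_t min_wait_usec)
{
	p->epfd = epoll_create1(EPOLL_CLOEXEC);
	if (p->epfd < 0)
		return -1;

	/* Only keep userspace busyness tracking if setup was rejected. */
	p->track_busyness = !try_set_min_wait(p->epfd, min_wait_usec);
	return 0;
}
```

The point of doing it this way is that the decision is made once, at
setup, and the epoll_wait() call sites stay unchanged whether or not the
running kernel supports the new op.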

Anyway, tossing this out there as it yielded quite good results in
some initial testing; we're running more of it. Not a lot of changes
since v1, but it was posted in the middle of the merge window. Hoping
to get some more discussion this time around, or at least some...

Also available here:

https://git.kernel.dk/cgit/linux-block/log/?h=epoll-min_ts

Since v1:
- Split patch 4 a bit, to make the meat of the changes smaller
- Get rid of EPOLL_DEF_MIN_WAIT
- Rebase on current -git master

--
Jens Axboe