[PATCH v2 00/13] optimise registered buffer/file updates

From: Pavel Begunkov
Date: Tue Apr 04 2023 - 08:40:56 EST


This patchset optimises updates and removals of registered files and
buffers. The rsrc-update-bench test shows an 11x improvement (1040K ->
11468K updates / sec). It also improves latency by eliminating both the
wait for an RCU grace period and the bounce to another worker, and
reduces memory footprint by removing percpu refs.
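
As a rough userspace illustration of the idea (not the kernel code):
once every get/put runs under a single lock, as patch 5 does with
uring_lock, a plain integer counter is enough, and released nodes can
be parked in a small bounded cache instead of being freed. All names
below (struct node, struct ctx, node_alloc, node_get, node_put,
NODE_CACHE_MAX) are hypothetical and chosen only for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NODE_CACHE_MAX 4 /* hypothetical cap, cf. the limit in patch 13 */

struct node {
	int refs;          /* plain int: only touched with ctx->lock held */
	struct node *next; /* links nodes while they sit in the cache */
};

struct ctx {
	pthread_mutex_t lock; /* stands in for uring_lock */
	struct node *cache;   /* singly linked list of free nodes */
	int nr_cached;
};

/* Caller must hold ctx->lock. Reuses a cached node when one exists. */
static struct node *node_alloc(struct ctx *ctx)
{
	struct node *n = ctx->cache;

	if (n) {
		ctx->cache = n->next;
		ctx->nr_cached--;
	} else {
		n = malloc(sizeof(*n));
		if (!n)
			return NULL;
	}
	n->refs = 1;
	n->next = NULL;
	return n;
}

/* Caller must hold ctx->lock, so no atomic increment is needed. */
static void node_get(struct node *n)
{
	n->refs++;
}

/* Caller must hold ctx->lock. The last put caches or frees the node. */
static void node_put(struct ctx *ctx, struct node *n)
{
	if (--n->refs)
		return;
	if (ctx->nr_cached < NODE_CACHE_MAX) {
		n->next = ctx->cache;
		ctx->cache = n;
		ctx->nr_cached++;
	} else {
		free(n);
	}
}
```

The sketch only pays off when the lock is already held on the hot
path for other reasons; otherwise taking it just to bump a counter
would likely cost more than an atomic.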

That's quite important for apps updating files/buffers with medium or
higher frequency as updates are slow and expensive, and it currently
takes quite a number of IO requests per update to make using fixed
files/buffers worthwhile.

Another upside is that the code gets simpler: patch 9 removes the very
convoluted synchronisation via flush_delayed_work() from the quiesce
path.

v2: rebase, add patches 12 and 13 to remove the last pair of atomics
from the path and to limit caching.

Pavel Begunkov (13):
io_uring/rsrc: use non-pcpu refcounts for nodes
io_uring/rsrc: keep cached refs per node
io_uring: don't put nodes under spinlocks
io_uring: io_free_req() via tw
io_uring/rsrc: protect node refs with uring_lock
io_uring/rsrc: kill rsrc_ref_lock
io_uring/rsrc: rename rsrc_list
io_uring/rsrc: optimise io_rsrc_put allocation
io_uring/rsrc: don't offload node free
io_uring/rsrc: cache struct io_rsrc_node
io_uring/rsrc: add lockdep sanity checks
io_uring/rsrc: optimise io_rsrc_data refcounting
io_uring/rsrc: add custom limit for node caching

include/linux/io_uring_types.h | 8 +-
io_uring/alloc_cache.h | 6 +-
io_uring/io_uring.c | 54 ++++++----
io_uring/rsrc.c | 176 ++++++++++++---------------------
io_uring/rsrc.h | 58 +++++------
5 files changed, 136 insertions(+), 166 deletions(-)

--
2.39.1