[PATCH net-next 00/21] rxrpc: Miscellaneous changes and make use of MSG_SPLICE_PAGES
From: David Howells
Date: Fri Mar 01 2024 - 11:38:58 EST
Here are some changes to AF_RXRPC:
(1) Cache the transmission serial number of ACK and DATA packets in the
rxrpc_txbuf struct and log this in the retransmit tracepoint.
(2) Don't use atomics on rxrpc_txbuf::flags[*] and cache the intended wire
header flags there too to avoid duplication.
(3) Cache the wire checksum in rxrpc_txbuf to make it easier to create
jumbo packets in future (which will require altering the wire header
to a jumbo header and restoring it back again for retransmission).
(4) Fix the protocol names in the wire ACK trailer struct.
(5) Strip all the barriers and atomics out of the call timer tracking[*].
(6) Remove atomic handling from call->tx_transmitted and
call->acks_prev_seq[*].
(7) Don't bother resetting the DF flag after UDP packet transmission. To
change it, we now call directly into UDP code, so it's quick just to
set it every time.
(8) Merge together the DF/non-DF branches of the DATA transmission to
reduce duplication in the code.
(9) Add a kvec array into rxrpc_txbuf and start moving things over to it.
This paves the way for using page frags.
(10) Split (sub)packet preparation and timestamping out of the DATA
transmission function. This helps pave the way for future jumbo
packet generation.
(11) In rxkad, don't pick values out of the wire header stored in
rxrpc_txbuf, but rather find them elsewhere so we can remove the wire
header from there.
(12) Move rxrpc_send_ACK() to output.c so that it can be merged with
rxrpc_send_ack_packet().
(13) Use rxrpc_txbuf::kvec[0] to access the wire header for the packet
rather than directly accessing the copy in rxrpc_txbuf. This will
allow the header to be moved out to a page frag.
(14) Switch from keeping the transmission buffers in rxrpc_txbuf allocated
in the slab to allocating them using page fragment allocators. There
are separate allocators for DATA packets (which persist for a while)
and control packets (which are discarded immediately).
We can then turn on MSG_SPLICE_PAGES when transmitting DATA and ACK
packets.
We can also get rid of the RCU cleanup on rxrpc_txbufs, preferring
instead to release the page frags as soon as possible.
(15) Parse received packets before handling timeouts as the former may
reset the latter.
(16) Make sure we don't retransmit DATA packets after all the packets have
been ACK'd.
(17) Differentiate traces for PING ACK transmission.
(18) Switch to keeping timeouts as ktime_t rather than a number of jiffies
as the latter is too coarse a granularity. Only set the call timer at
the end of the call event function from the aggregate of all the
timeouts, thereby reducing the number of timer calls made. In future,
it might be possible to reduce the number of timers from one per call
to one per I/O thread and to use a high-precision timer.
(19) Record RTT probes after successful transmission rather than recording
them before and then cancelling them afterwards if unsuccessful[*]. This
allows a number of calls to get the current time to be removed.
(20) Clean up the resend algorithm as there's now no need to walk the
transmission buffer under lock[*]. DATA packets can be retransmitted
as soon as they're found rather than being queued up and transmitted
when the lock is dropped.
(21) When initially parsing a received ACK packet, extract some of the
fields from the ack info to the skbuff private data. This makes it
easier to do path MTU discovery in the future when the call to which a
PING RESPONSE ACK refers has been deallocated.
[*] Possible with the move of almost all code from softirq context to the
I/O thread.
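To illustrate the idea behind (18), here is a rough userspace sketch of
computing one timer expiry from the aggregate of several timeouts. It is not
the code from the patches: sketch_ktime_t, SKETCH_KTIME_NEVER and both
function names are made up for illustration, with a signed nanosecond count
standing in for ktime_t and INT64_MAX assumed to mean "timeout not armed".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's ktime_t: signed nanoseconds. */
typedef int64_t sketch_ktime_t;

/* Assumed sentinel meaning "this timeout is not armed". */
#define SKETCH_KTIME_NEVER INT64_MAX

/* Return the earliest armed deadline, or SKETCH_KTIME_NEVER if none is armed. */
sketch_ktime_t sketch_earliest_deadline(const sketch_ktime_t *deadlines, size_t n)
{
	sketch_ktime_t next = SKETCH_KTIME_NEVER;
	size_t i;

	assert(deadlines != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (deadlines[i] < next)
			next = deadlines[i];
	return next;
}

/*
 * Turn the aggregate deadline into a relative delay in nanoseconds,
 * clamping to zero for a deadline that has already passed.
 * Returns -1 if no timeout is armed, i.e. no timer needs setting.
 */
int64_t sketch_timer_delay_ns(sketch_ktime_t now,
			      const sketch_ktime_t *deadlines, size_t n)
{
	sketch_ktime_t next = sketch_earliest_deadline(deadlines, n);

	if (next == SKETCH_KTIME_NEVER)
		return -1;
	return next > now ? next - now : 0;
}
```

The point of the sketch is that each event handler only updates its own
deadline, and the timer is touched once, at the end, using the minimum.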
The patches are tagged here:
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/rxrpc-iothread-20240301
And can be found on this branch:
http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-iothread
David
David Howells (21):
rxrpc: Record the Tx serial in the rxrpc_txbuf and retransmit trace
rxrpc: Convert rxrpc_txbuf::flags into a mask and don't use atomics
rxrpc: Note cksum in txbuf
rxrpc: Fix the names of the fields in the ACK trailer struct
rxrpc: Strip barriers and atomics off of timer tracking
rxrpc: Remove atomic handling on some fields only used in I/O thread
rxrpc: Do lazy DF flag resetting
rxrpc: Merge together DF/non-DF branches of data Tx function
rxrpc: Add a kvec[] to the rxrpc_txbuf struct
rxrpc: Split up the DATA packet transmission function
rxrpc: Don't pick values out of the wire header when setting up
security
rxrpc: Move rxrpc_send_ACK() to output.c with rxrpc_send_ack_packet()
rxrpc: Use rxrpc_txbuf::kvec[0] instead of rxrpc_txbuf::wire
rxrpc: Do zerocopy using MSG_SPLICE_PAGES and page frags
rxrpc: Parse received packets before dealing with timeouts
rxrpc: Don't permit resending after all Tx packets acked
rxrpc: Differentiate PING ACK transmission traces.
rxrpc: Use ktimes for call timeout tracking and set the timer lazily
rxrpc: Record probes after transmission and reduce number of time-gets
rxrpc: Clean up the resend algorithm
rxrpc: Extract useful fields from a received ACK to skb priv data
include/trace/events/rxrpc.h | 198 ++++++++--------
net/rxrpc/af_rxrpc.c | 12 +-
net/rxrpc/ar-internal.h | 88 ++++---
net/rxrpc/call_event.c | 327 ++++++++++++--------------
net/rxrpc/call_object.c | 56 ++---
net/rxrpc/conn_client.c | 4 +-
net/rxrpc/conn_event.c | 16 +-
net/rxrpc/conn_object.c | 4 +
net/rxrpc/input.c | 116 +++++----
net/rxrpc/insecure.c | 11 +-
net/rxrpc/io_thread.c | 11 +
net/rxrpc/local_object.c | 3 +
net/rxrpc/misc.c | 8 +-
net/rxrpc/output.c | 441 +++++++++++++++++------------------
net/rxrpc/proc.c | 10 +-
net/rxrpc/protocol.h | 6 +-
net/rxrpc/rtt.c | 36 +--
net/rxrpc/rxkad.c | 58 ++---
net/rxrpc/sendmsg.c | 63 ++---
net/rxrpc/sysctl.c | 16 +-
net/rxrpc/txbuf.c | 174 +++++++++++---
21 files changed, 853 insertions(+), 805 deletions(-)