TLS zerocopy sendfile offset causes data corruption

From: Adrien Moulin
Date: Fri Mar 03 2023 - 07:16:51 EST


Hi,

When calling sendfile on a TLS_TX_ZEROCOPY_RO-enabled socket with an offset that is neither zero nor 4k-aligned, and with a "count" larger than a single TLS record, part of the received data is corrupted.
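
As a rough sketch, the condition above can be written as a small C predicate. The 4096-byte page size, the 16384-byte maximum TLS record payload and the helper name are assumptions for illustration; none of this is taken from the reproducer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Assumed values: 4 KiB alignment and full-size 16 KiB TLS records. */
#define PAGE_BYTES 4096
#define TLS_RECORD_BYTES 16384

/*
 * Hypothetical helper: returns true when a sendfile(offset, count) pair
 * matches the trigger described in this report, i.e. the offset is not
 * 4k-aligned (zero counts as aligned) and count exceeds one TLS record.
 */
bool matches_reported_trigger(off_t offset, size_t count)
{
    bool unaligned = (offset % PAGE_BYTES) != 0;
    bool multi_record = count > TLS_RECORD_BYTES;

    return unaligned && multi_record;
}
```

With the values used below, matches_reported_trigger(8, 32760) is true, while an offset of 0 or 16384 gives false.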

I am seeing this on 5.19 and 6.2.1 (x86_64) with a ConnectX-6 Dx NIC. TLS NIC offload, including sendfile, otherwise works perfectly when TLS_TX_ZEROCOPY_RO is not used.
A simple reproducer program is available here: https://gist.github.com/elyosh/922e6c15f8d4d7102c8ac9508b0cdc3b

Sending a 32K file with sendfile at an 8-byte offset, first without zerocopy:

# ./ktls_test -i testfile -p 443 -c cert.pem -k key.pem -o 8
Serving file testfile, will send 32760 bytes (8 - 32768) with SHA1 sum 83fc1e3900cf900025311f2c27378a357f9f4d2c
sendfile(5, 3, 8, 32760) = 32760

% wget -S -q -O test_copy https://xxxxxx/; shasum test_copy
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 32760
X-Source-SHA1: 83fc1e3900cf900025311f2c27378a357f9f4d2c
83fc1e3900cf900025311f2c27378a357f9f4d2c test_copy

The same with TLS_TX_ZEROCOPY_RO enabled; the received data is corrupted:

# ./ktls_test -i testfile -p 443 -c cert.pem -k key.pem -o 8 -z
Serving file testfile, will send 32760 bytes (8 - 32768) with SHA1 sum 83fc1e3900cf900025311f2c27378a357f9f4d2c
TLS_TX_ZEROCOPY_RO enabled
sendfile(5, 3, 8, 32760) = 32760

% wget -S -q -O test_zerocopy https://xxxxxx/; shasum test_zerocopy
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 32760
X-Source-SHA1: 83fc1e3900cf900025311f2c27378a357f9f4d2c
03374f669f98d5f56837660a3817ce1d2a2819f8 test_zerocopy

% diff -U 1 -d <(xxd test_copy) <(xxd test_zerocopy)
--- /dev/fd/11 2023-03-03 10:13:26
+++ /dev/fd/12 2023-03-03 10:13:26
@@ -1087,3 +1087,3 @@
000043e0: 1010 1010 1010 1010 1010 1010 1010 1010 ................
-000043f0: 1010 1010 1010 1010 1111 1111 1111 1111 ................
+000043f0: 1010 1010 1010 1010 1010 1010 1010 1010 ................
00004400: 1111 1111 1111 1111 1111 1111 1111 1111 ................
@@ -1151,3 +1151,3 @@
000047e0: 1111 1111 1111 1111 1111 1111 1111 1111 ................
-000047f0: 1111 1111 1111 1111 1212 1212 1212 1212 ................
+000047f0: 1111 1111 1111 1111 1111 1111 1111 1111 ................
00004800: 1212 1212 1212 1212 1212 1212 1212 1212 ................
@@ -1215,3 +1215,3 @@
00004be0: 1212 1212 1212 1212 1212 1212 1212 1212 ................
-00004bf0: 1212 1212 1212 1212 1313 1313 1313 1313 ................
+00004bf0: 1212 1212 1212 1212 1212 1212 1212 1212 ................
00004c00: 1313 1313 1313 1313 1313 1313 1313 1313 ................
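
To relate the hunks above to the source file, the received offsets can be mapped back with simple arithmetic. This is only a sketch: the struct and function names are hypothetical, and the record index assumes that the sendfile payload starts on a record boundary and is split into full 16384-byte records, which I have not verified.

```c
#include <sys/types.h>

/* Assumed: payload is split into full-size 16 KiB TLS records. */
#define TLS_RECORD_BYTES 16384

/* Hypothetical result type for illustration. */
struct stream_pos {
    off_t file_offset;   /* position in the source file */
    off_t record_index;  /* 0-based record within the sendfile payload */
    off_t record_offset; /* byte position inside that record */
};

/* Map an offset in the received body back to file and record positions. */
struct stream_pos locate(off_t received_offset, off_t sendfile_offset)
{
    struct stream_pos p;

    p.file_offset = received_offset + sendfile_offset;
    p.record_index = received_offset / TLS_RECORD_BYTES;
    p.record_offset = received_offset % TLS_RECORD_BYTES;
    return p;
}
```

The first differing byte of the first hunk is at received offset 0x43f8, and locate(0x43f8, 8) gives file offset 0x4400, record index 1 and record offset 0x3f8. The other two hunks map to file offsets 0x4800 and 0x4c00.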

For context, I noticed this issue while trying to serve cached files with nginx. For static files this works fine (the sendfile offset is zero at first, then 16k-aligned), but cached files are stored with a ~500-byte header that is skipped in the sendfile call, which triggers this issue.

Best regards

--
Adrien Moulin