[PATCH 4.9 22/72] tcp: provide timestamps for partial writes

From: Greg Kroah-Hartman
Date: Sun Nov 19 2017 - 09:40:10 EST


4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Soheil Hassas Yeganeh <soheil@xxxxxxxxxx>


[ Upstream commit ad02c4f547826167a709dab8a89a1caefd2c1f50 ]

For TCP sockets, TX timestamps are only captured when the user data
is successfully and fully written to the socket. In many cases,
however, TCP writes can be partial for which no timestamp is
collected.

Collect timestamps whenever any user data is (fully or partially)
copied into the socket. Pass tcp_write_queue_tail to tcp_tx_timestamp
instead of the local skb pointer since it can be set to NULL on
the error path.

Note that tcp_write_queue_tail can be NULL, even if bytes have been
copied to the socket. This is because acknowledgements are being
processed in tcp_sendmsg(), and by the time tcp_tx_timestamp is
called tcp_write_queue_tail can be NULL. For such cases, this patch
does not collect any timestamps (i.e., it is best-effort).

This patch is written with suggestions from Willem de Bruijn and
Eric Dumazet.

Change-log V1 -> V2:
- Use sockc.tsflags instead of sk->sk_tsflags.
- Use the same code path for normal writes and errors.

Signed-off-by: Soheil Hassas Yeganeh <soheil@xxxxxxxxxx>
Acked-by: Yuchung Cheng <ycheng@xxxxxxxxxx>
Cc: Willem de Bruijn <willemb@xxxxxxxxxx>
Cc: Eric Dumazet <edumazet@xxxxxxxxxx>
Cc: Neal Cardwell <ncardwell@xxxxxxxxxx>
Cc: Martin KaFai Lau <kafai@xxxxxx>
Acked-by: Willem de Bruijn <willemb@xxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
net/ipv4/tcp.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)

--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -431,7 +431,7 @@ EXPORT_SYMBOL(tcp_init_sock);

static void tcp_tx_timestamp(struct sock *sk, u16 tsflags, struct sk_buff *skb)
{
- if (tsflags) {
+ if (tsflags && skb) {
struct skb_shared_info *shinfo = skb_shinfo(skb);
struct tcp_skb_cb *tcb = TCP_SKB_CB(skb);

@@ -966,10 +966,8 @@ new_segment:
copied += copy;
offset += copy;
size -= copy;
- if (!size) {
- tcp_tx_timestamp(sk, sk->sk_tsflags, skb);
+ if (!size)
goto out;
- }

if (skb->len < size_goal || (flags & MSG_OOB))
continue;
@@ -995,8 +993,11 @@ wait_for_memory:
}

out:
- if (copied && !(flags & MSG_SENDPAGE_NOTLAST))
- tcp_push(sk, flags, mss_now, tp->nonagle, size_goal);
+ if (copied) {
+ tcp_tx_timestamp(sk, sk->sk_tsflags, tcp_write_queue_tail(sk));
+ if (!(flags & MSG_SENDPAGE_NOTLAST))
+ tcp_push(sk, flags, mss_now, tp->nonagle, size_goal);
+ }
return copied;

do_error:
@@ -1289,7 +1290,6 @@ new_segment:

copied += copy;
if (!msg_data_left(msg)) {
- tcp_tx_timestamp(sk, sockc.tsflags, skb);
if (unlikely(flags & MSG_EOR))
TCP_SKB_CB(skb)->eor = 1;
goto out;
@@ -1320,8 +1320,10 @@ wait_for_memory:
}

out:
- if (copied)
+ if (copied) {
+ tcp_tx_timestamp(sk, sockc.tsflags, tcp_write_queue_tail(sk));
tcp_push(sk, flags, mss_now, tp->nonagle, size_goal);
+ }
out_nopush:
release_sock(sk);
return copied + copied_syn;