[PATCH] af_unix: utilize skb's fragment list for sending large datagrams

From: Jan Dakinevich
Date: Thu Aug 22 2019 - 09:45:31 EST


When somebody tries to send a big datagram, the kernel attempts to
avoid a high-order allocation by splitting the datagram between the
skb's data buffer and the skb's paged part (->frags).

However, the paged part cannot exceed MAX_SKB_FRAGS * PAGE_SIZE, so a
larger datagram makes the skb's data buffer grow instead. Thus, any
user-space program that sets its send buffer (by calling
setsockopt(SO_SNDBUF, ...)) to the maximum allowed size (wmem_max) is
able to cause any number of uncontrolled high-order kernel allocations.

To avoid this, do not pass more than SKB_MAX_ALLOC for the skb's data
buffer and, for huge datagrams, make use of the skb's fragment list
(->frag_list) in addition to the paged part.
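
As a rough illustration of the resulting split, here is a user-space
sketch that is not part of the patch. The EX_* constants are assumed
example values for 4 KiB pages (the real SKB_MAX_ALLOC is somewhat
smaller and all three depend on architecture and configuration), and
split_datagram() is a hypothetical helper that only mirrors the
arithmetic done in unix_dgram_sendmsg():

```c
#include <assert.h>

/* Assumed example values, NOT the kernel's definitions. */
#define EX_PAGE_SIZE      4096UL
#define EX_MAX_SKB_FRAGS  17UL
#define EX_SKB_MAX_ALLOC  16384UL

/* How a datagram of a given length would be distributed. */
struct dgram_split {
	unsigned long header_len; /* bytes in the skb's data buffer */
	unsigned long paged_len;  /* bytes in the skb's paged part */
	unsigned long frag_len;   /* bytes left for the fragment list */
	unsigned long nr_frags;   /* skbs needed on the fragment list */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Hypothetical helper: same arithmetic as the patch, outside the kernel. */
static struct dgram_split split_datagram(unsigned long len)
{
	struct dgram_split s;

	/* Data buffer is capped so it never needs a larger allocation. */
	s.header_len = min_ul(len, EX_SKB_MAX_ALLOC);
	/* Paged part takes what it can, up to its fixed limit. */
	s.paged_len = min_ul(len - s.header_len,
			     EX_MAX_SKB_FRAGS * EX_PAGE_SIZE);
	/* Whatever remains goes to the fragment list. */
	s.frag_len = len - s.header_len - s.paged_len;
	/* Each list entry carries at most EX_SKB_MAX_ALLOC bytes. */
	s.nr_frags = (s.frag_len + EX_SKB_MAX_ALLOC - 1) / EX_SKB_MAX_ALLOC;

	/* The three parts always add up to the full datagram. */
	assert(s.header_len + s.paged_len + s.frag_len == len);
	return s;
}
```

With these assumed values, a 200000-byte datagram would put 16384
bytes in the data buffer, 69632 bytes in the paged part and the
remaining 113984 bytes into 7 skbs on the fragment list.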

Signed-off-by: Jan Dakinevich <jan.dakinevich@xxxxxxxxxxxxx>
---
net/unix/af_unix.c | 38 +++++++++++++++++++++++++++-----------
1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 67e87db..0c13937 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1580,7 +1580,9 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
struct sk_buff *skb;
long timeo;
struct scm_cookie scm;
- int data_len = 0;
+ unsigned long frag_len;
+ unsigned long paged_len;
+ unsigned long header_len;
int sk_locked;

wait_for_unix_gc();
@@ -1613,27 +1615,41 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
if (len > sk->sk_sndbuf - 32)
goto out;

- if (len > SKB_MAX_ALLOC) {
- data_len = min_t(size_t,
- len - SKB_MAX_ALLOC,
- MAX_SKB_FRAGS * PAGE_SIZE);
- data_len = PAGE_ALIGN(data_len);
+ BUILD_BUG_ON(SKB_MAX_ALLOC < PAGE_SIZE);

- BUILD_BUG_ON(SKB_MAX_ALLOC < PAGE_SIZE);
- }
+ header_len = min(len, SKB_MAX_ALLOC);
+ paged_len = min(len - header_len, MAX_SKB_FRAGS * PAGE_SIZE);
+ frag_len = len - header_len - paged_len;

- skb = sock_alloc_send_pskb(sk, len - data_len, data_len,
+ skb = sock_alloc_send_pskb(sk, header_len, paged_len,
msg->msg_flags & MSG_DONTWAIT, &err,
PAGE_ALLOC_COSTLY_ORDER);
if (skb == NULL)
goto out;

+ while (frag_len) {
+ unsigned long size = min(SKB_MAX_ALLOC, frag_len);
+ struct sk_buff *frag;
+
+ frag = sock_alloc_send_pskb(sk, size, 0,
+ msg->msg_flags & MSG_DONTWAIT,
+ &err, 0);
+ if (!frag)
+ goto out_free;
+
+ skb_put(frag, size);
+ frag->next = skb_shinfo(skb)->frag_list;
+ skb_shinfo(skb)->frag_list = frag;
+
+ frag_len -= size;
+ }
+
err = unix_scm_to_skb(&scm, skb, true);
if (err < 0)
goto out_free;

- skb_put(skb, len - data_len);
- skb->data_len = data_len;
+ skb_put(skb, header_len);
+ skb->data_len = len - header_len;
skb->len = len;
err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, len);
if (err)
--
2.1.4