Re: [PATCH] netdev: add netdev_pagefrag_enabled sysctl

From: David Miller
Date: Sat Nov 11 2017 - 05:20:10 EST


From: Hongbo Li <herbert.tencent@xxxxxxxxx>
Date: Thu, 9 Nov 2017 16:12:27 +0800

> From: Hongbo Li <herberthbli@xxxxxxxxxxx>
>
> This patch solves a memory fragmentation issue when allocating
> skbs. I found this issue in a UDP scenario. Here is my test model:
> 1. About five hundred UDP threads listen on the server,
> and five hundred client threads send UDP packets to them.
> Some threads send packets faster than others.
> 2. The user processes on the server cannot receive these
> packets fast enough.
>
> Then I got the following result:
> 1. Some UDP sockets' recv-q reach the queue's limit; others
> do not, because of the global rmem limit.
> 2. The "free" command shows that "used" memory is more than 62GB,
> but cat /proc/net/sockstat shows that UDP uses only 12GB.
>
> This confuses the user about why the system consumes so
> much memory. It is caused by memory fragmentation in the netdev
> layer: __netdev_alloc_frag() allocates a page block of 8 pages.
>
> In this scenario, most skbs are freed when the recv-q
> is full, but if any skb in the same page block is queued to
> another recv-q which is not full, the whole page block can't
> be freed.
>
> So from the kernel's point of view, these pages are used, but
> from the TCP/UDP point of view, only the skbs in the recv-q are used.
>
> To avoid exhausting memory in such a scenario, I add a sysctl
> that lets the user disable allocating skbs from page frags.
>
> Signed-off-by: Hongbo Li <herberthbli@xxxxxxxxxxx>
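
The pinning effect described in the quoted text can be illustrated with
a small user-space model. This is only a sketch, not the kernel's
implementation: struct page_block and the block_*() helpers are
hypothetical names, the 4 KiB page size and 2 KiB per-skb fragment size
are assumptions, and only the 8-pages-per-block figure comes from the
patch description. The real __netdev_alloc_frag() tracks page
references differently.

```c
#include <assert.h>

/* Assumptions: 4 KiB pages and a 2 KiB fragment per skb are illustrative;
 * only the 8-pages-per-block figure comes from the patch description. */
#define ASSUMED_PAGE_SIZE 4096u
#define PAGES_PER_BLOCK   8u
#define BLOCK_SIZE        (ASSUMED_PAGE_SIZE * PAGES_PER_BLOCK)
#define ASSUMED_FRAG_SIZE 2048u

/* Hypothetical stand-in for one page block handed out in fragments. */
struct page_block {
	unsigned int offset;	/* bytes already carved out of the block */
	unsigned int in_use;	/* fragments still referenced by queued skbs */
};

/* Carve one fragment; returns 0 on success, -1 when the block is exhausted. */
static int block_alloc_frag(struct page_block *b)
{
	if (b->offset + ASSUMED_FRAG_SIZE > BLOCK_SIZE)
		return -1;
	b->offset += ASSUMED_FRAG_SIZE;
	b->in_use++;
	return 0;
}

/* Drop one fragment; returns 1 only when the whole block becomes freeable. */
static int block_free_frag(struct page_block *b)
{
	assert(b->in_use > 0);
	b->in_use--;
	return b->in_use == 0;
}

/* Memory the allocator must keep: all or nothing per block. */
static unsigned int block_pinned_bytes(const struct page_block *b)
{
	return b->in_use ? BLOCK_SIZE : 0;
}

/* Memory the sockets would account for: only the live fragments. */
static unsigned int block_queued_bytes(const struct page_block *b)
{
	return b->in_use * ASSUMED_FRAG_SIZE;
}
```

With these assumed sizes a block yields 16 fragments. If 15 of them are
freed and one stays queued on a socket that is not full,
block_pinned_bytes() still reports 32768 while block_queued_bytes()
reports 2048, which is the kind of gap between "used" memory and socket
accounting that the report describes.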

When something like page fragments doesn't work properly, we fix
it rather than providing a way to disable it.

Thank you.