Re: [PATCH net v2 0/2] Revert the 'socket_alloc' life cycle change

From: SeongJae Park
Date: Tue May 05 2020 - 07:54:41 EST


CC-ing stable@xxxxxxxxxxxxxxx and adding some more explanations.

On Tue, 5 May 2020 10:10:33 +0200 SeongJae Park <sjpark@xxxxxxxxxx> wrote:

> From: SeongJae Park <sjpark@xxxxxxxxx>
>
> The commit 6d7855c54e1e ("sockfs: switch to ->free_inode()") made the
> deallocation of 'socket_alloc' asynchronous using RCU, the same as
> 'sock.wq'. The following commit 333f7909a857 ("coallocate socket_wq
> with socket itself") then gave the two the same life cycle.
>
> These changes made the code much simpler, but also made 'socket_alloc'
> live longer than before. For this reason, user programs that intensively
> repeat allocations and deallocations of sockets could cause memory
> pressure on recent kernels.
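
For illustration only, the kind of user program described above can be
sketched as below. This is a minimal Python sketch, not the LEBench code
(which is written in C). The function names, the AF_UNIX socket family and
the POLLIN flag are assumptions made for the example. Each round opens a
batch of sockets, polls them once and closes them all, so every round
allocates and frees one batch of 'socket_alloc' objects.

```python
import select
import socket


def poll_big_round(num_sockets=1000, timeout_ms=0):
    """Open num_sockets sockets, poll them once, then close them all.

    Returns the number of sockets that were opened in this round.
    """
    socks = []
    try:
        for _ in range(num_sockets):
            # Assumption: AF_UNIX stream sockets; any socket type goes
            # through the same sockfs inode allocation.
            socks.append(socket.socket(socket.AF_UNIX, socket.SOCK_STREAM))
        poller = select.poll()
        for s in socks:
            poller.register(s.fileno(), select.POLLIN)
        poller.poll(timeout_ms)  # the result is not needed for the sketch
        return len(socks)
    finally:
        # Closing is what triggers the (possibly deferred) deallocation.
        for s in socks:
            s.close()


def run(rounds=10_000, num_sockets=1000):
    """Repeat the open/poll/close round and return the total sockets opened."""
    total = 0
    for _ in range(rounds):
        total += poll_big_round(num_sockets)
    return total
```

Running 1000 sockets per round needs a file descriptor limit above 1000,
so smaller values such as run(rounds=100, num_sockets=100) are easier to
try first.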

I found this problem on a production virtual machine with 4GB of memory while
running lebench[1]. The 'poll big' test of lebench opens 1000 sockets, polls
them, and closes them. This test is repeated 10,000 times. Therefore it should
need only 1000 'socket_alloc' objects at a time. As the size of socket_alloc is
about 800 bytes, that is only about 800 KB. However, on recent kernels, it could
consume up to 10,000,000 objects (about 8 GB). On the test machine, I
confirmed that it consumed about 4GB of the system memory and resulted in OOM.

[1] https://github.com/LinuxPerfStudy/LEBench
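
The arithmetic behind those numbers can be checked with a short sketch. The
constant and function names are only illustrative, and the 800-byte object
size is the approximate figure given above, not a measured value.

```python
OBJ_SIZE_BYTES = 800        # approximate size of one 'socket_alloc'
SOCKETS_PER_ROUND = 1_000   # sockets opened by one 'poll big' round
ROUNDS = 10_000             # number of times the round is repeated


def footprint_bytes(live_objects, obj_size=OBJ_SIZE_BYTES):
    """Memory held by live_objects objects of obj_size bytes each."""
    return live_objects * obj_size


# Only one round's worth of objects is alive at a time.
expected = footprint_bytes(SOCKETS_PER_ROUND)

# Upper bound: no object from any round has been freed yet.
worst = footprint_bytes(SOCKETS_PER_ROUND * ROUNDS)

print(f"expected: {expected / 1e3:.0f} KB")   # expected: 800 KB
print(f"worst:    {worst / 1e9:.0f} GB")      # worst:    8 GB
```

In binary units the upper bound is about 7.45 GiB, which is still well
above the 4GB of the test machine.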

>
> To avoid the problem, this commit reverts the changes.

I also tried to write a fixup rather than reverts, but I couldn't easily find a
simple one. As the commits 6d7855c54e1e and 333f7909a857 were for code
refactoring rather than performance optimization, I thought introducing a complex
fixup for this problem would make no sense. Meanwhile, the memory pressure
regression could affect real machines. For this reason, I decided to quickly
revert the commits first and consider a better refactoring later.


Thanks,
SeongJae Park

>
> SeongJae Park (2):
>   Revert "coallocate socket_wq with socket itself"
>   Revert "sockfs: switch to ->free_inode()"
>
>  drivers/net/tap.c      |  5 +++--
>  drivers/net/tun.c      |  8 +++++---
>  include/linux/if_tap.h |  1 +
>  include/linux/net.h    |  4 ++--
>  include/net/sock.h     |  4 ++--
>  net/core/sock.c        |  2 +-
>  net/socket.c           | 23 ++++++++++++++++-------
>  7 files changed, 30 insertions(+), 17 deletions(-)
>
> --
> 2.17.1