Re: [PATCH v2 0/5] riscv: improving uaccess with logs from network bench

From: Ben Dooks
Date: Sun Jun 20 2021 - 06:23:21 EST


On 19/06/2021 12:21, Akira Tsukamoto wrote:
> Optimizing copy_to_user and copy_from_user.
>
> I rewrote the functions in v2, heavily influenced by Gary's memcpy
> function [1].
> The functions must be written in assembler to handle page faults manually
> inside the function.
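
For readers following along: the routine has to be able to take a page
fault mid-copy and report how much was left uncopied. A minimal sketch of
that contract, assuming standard kernel uaccess semantics (read_from_user
is a hypothetical caller, not from the series):

  /*
   * Sketch only: the assembler routine registers exception-table
   * fixups for every user-memory access, so a page fault lands on a
   * fixup label inside the function and the routine returns the
   * number of bytes it could NOT copy. A plain memcpy() on a bad
   * user pointer would oops the kernel instead.
   */
  static long read_from_user(void *kbuf, const void __user *ubuf,
                             size_t len)
  {
          unsigned long left = __asm_copy_from_user(kbuf, ubuf, len);

          if (left)       /* faulted after copying len - left bytes */
                  return -EFAULT;
          return 0;
  }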

> With these changes, the CPU usage percentages improve and the UDP
> network throughput improves somewhat.
> Only copy_user is patched; the original memcpy is kept.

> All results are from the same base kernel, same rootfs and same
> BeagleV beta board.

Is there a git tree for these to try them out?

> Comparison by "perf top -Ue task-clock" while running iperf3.
>
> --- TCP recv ---
>  * Before
>   40.40%  [kernel]  [k] memcpy
>   33.09%  [kernel]  [k] __asm_copy_to_user
>  * After
>   50.35%  [kernel]  [k] memcpy
>   13.76%  [kernel]  [k] __asm_copy_to_user
>
> --- TCP send ---
>  * Before
>   19.96%  [kernel]  [k] memcpy
>    9.84%  [kernel]  [k] __asm_copy_to_user
>  * After
>   14.27%  [kernel]  [k] memcpy
>    7.37%  [kernel]  [k] __asm_copy_to_user
>
> --- UDP send ---
>  * Before
>   25.18%  [kernel]  [k] memcpy
>   22.50%  [kernel]  [k] __asm_copy_to_user
>  * After
>   28.90%  [kernel]  [k] memcpy
>    9.49%  [kernel]  [k] __asm_copy_to_user
>
> --- UDP recv ---
>  * Before
>   44.45%  [kernel]  [k] memcpy
>   31.04%  [kernel]  [k] __asm_copy_to_user
>  * After
>   55.62%  [kernel]  [k] memcpy
>   11.22%  [kernel]  [k] __asm_copy_to_user
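
(A reproduction sketch, since only the tools are named above: something
along the lines of

  iperf3 -s                      # on the BeagleV
  iperf3 -c <board-ip>           # TCP from a peer; -R for the recv direction
  iperf3 -c <board-ip> -u -b 0   # UDP at unlimited rate
  perf top -Ue task-clock        # on the BeagleV during the run

though the exact invocations used are a guess.)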

What's the memcpy figure in the above?
Could you explain the figures, please?

> Processing network packets requires a lot of unaligned accesses to the
> packet header, and the header format cannot be redesigned to make the
> fields aligned.
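
To make that concrete: the common 14-byte Ethernet header leaves the IPv4
header only 2-byte aligned, so its 32-bit fields sit at misaligned
addresses. A sketch, assuming kernel context (peek_saddr is a hypothetical
helper):

  #include <linux/ip.h>
  #include <asm/unaligned.h>

  /* After a 14-byte Ethernet header the IPv4 header is 2-byte
   * aligned, so 32-bit fields such as saddr live at misaligned
   * addresses and are read with the unaligned helpers. */
  static __be32 peek_saddr(const u8 *frame)
  {
          const struct iphdr *iph = (const struct iphdr *)(frame + 14);

          return get_unaligned(&iph->saddr);
  }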

Isn't there an option to allow padding of network packets
in the skbuff to make the fields aligned for architectures
which do not have efficient unaligned loads (looking at you,
arm32)? Has this been looked at?
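
For reference, the knob alluded to is NET_IP_ALIGN: RX paths reserve two
bytes of headroom so the IP header behind the 14-byte Ethernet header lands
on a 4-byte boundary, though it only helps when the NIC can DMA to an
unaligned address. A sketch of the usual driver pattern (alloc_rx_skb is a
hypothetical helper):

  #include <linux/netdevice.h>
  #include <linux/skbuff.h>

  /* NET_IP_ALIGN is 2 on most architectures; reserving it shifts the
   * buffer so the IP header after the Ethernet header is 4-byte
   * aligned. */
  static struct sk_buff *alloc_rx_skb(struct net_device *dev,
                                      unsigned int len)
  {
          struct sk_buff *skb = netdev_alloc_skb(dev, len + NET_IP_ALIGN);

          if (skb)
                  skb_reserve(skb, NET_IP_ALIGN);
          return skb;
  }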

> User applications also invoke send/recv() and sendto/recvfrom() with
> large buffers, so that fewer system calls are needed.
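
A minimal userspace sketch of that pattern (drain is a hypothetical
helper):

  #include <sys/socket.h>
  #include <sys/types.h>

  #define BUF_SZ (64 * 1024)

  /* One recv() with a large buffer moves the same data as many small
   * ones, but enters the kernel, and thus __asm_copy_to_user, far
   * less often. */
  static ssize_t drain(int fd, char buf[BUF_SZ])
  {
          ssize_t n, total = 0;

          while ((n = recv(fd, buf, BUF_SZ, 0)) > 0)
                  total += n;
          return n < 0 ? -1 : total;
  }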

> v1 -> v2:
> - Added shift copy
> - Separated patches for readability of changes in assembler
> - Using perf results
>
> [1] https://lkml.org/lkml/2021/2/16/778

> Akira Tsukamoto (5):
>   riscv: __asm_to/copy_from_user: delete existing code
>   riscv: __asm_to/copy_from_user: Adding byte copy first
>   riscv: __asm_to/copy_from_user: Copy until dst is aligned address
>   riscv: __asm_to/copy_from_user: Bulk copy while shifting misaligned
>     data
>   riscv: __asm_to/copy_from_user: Bulk copy when both src dst are
>     aligned
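
For orientation, the strategy those five patches build up can be outlined
in C roughly as below. Names are illustrative, the real code is RISC-V
assembler with per-access fault fixups, and the sketch assumes a
little-endian machine (as RISC-V is):

  #include <stddef.h>
  #include <stdint.h>

  #define WS sizeof(unsigned long)

  static void copy_sketch(unsigned char *dst, const unsigned char *src,
                          size_t n)
  {
          /* patch 3: byte copy until dst is word aligned */
          while (n && ((uintptr_t)dst & (WS - 1))) {
                  *dst++ = *src++;
                  n--;
          }

          if (((uintptr_t)src & (WS - 1)) == 0) {
                  /* patch 5: src and dst both aligned: plain word copy */
                  for (; n >= WS; n -= WS, dst += WS, src += WS)
                          *(unsigned long *)dst = *(const unsigned long *)src;
          } else if (n >= 2 * WS) {
                  /* patch 4: src misaligned: read aligned words and merge
                   * each adjacent pair with shifts; stops a word early so
                   * the lookahead read stays inside the buffer */
                  size_t off = (uintptr_t)src & (WS - 1);
                  /* like the assembler, this assumes reading the aligned
                   * word containing the first byte is safe */
                  const unsigned long *w = (const unsigned long *)(src - off);
                  unsigned long lo = *w++, hi;
                  unsigned int rs = 8 * off, ls = 8 * (WS - off);

                  for (; n >= 2 * WS; n -= WS, dst += WS) {
                          hi = *w++;
                          *(unsigned long *)dst = (lo >> rs) | (hi << ls);
                          lo = hi;
                  }
                  src = (const unsigned char *)w - (WS - off);
          }

          while (n--)     /* tail bytes, and short copies (patch 2) */
                  *dst++ = *src++;
  }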

>  arch/riscv/lib/uaccess.S | 181 +++++++++++++++++++++++++++++++--------
>  1 file changed, 146 insertions(+), 35 deletions(-)

I'm concerned that deleting and then re-adding the code will either
make the series un-bisectable or leave a point where the kernel is
very broken.

--
Ben Dooks http://www.codethink.co.uk/
Senior Engineer Codethink - Providing Genius

https://www.codethink.co.uk/privacy.html