Re: [PATCH net-next 14/15 v2] net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.

From: Sebastian Andrzej Siewior
Date: Fri May 17 2024 - 12:16:10 EST


On 2024-05-14 14:20:03 [+0200], Jesper Dangaard Brouer wrote:
> Trick for CPU-map to do early drop on remote CPU:
>
> # ./xdp-bench redirect-cpu --cpu 3 --remote-action drop ixgbe1
>
> I recommend using Ctrl+\ while running to show more info, like which
> CPUs are being used and what the kthread consumes, to catch issues
> e.g. if you are redirecting to the same CPU that RX happens to run on.

Okay. So I reworked the last two patches to make the struct part of
task_struct and then did as you suggested. A rough sketch of the rework
and the results follow.
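
The shape of the rework is roughly the following. This is a sketch of
the idea only; the struct layout and the names used here
(bpf_net_context, the task_struct member, bpf_net_ctx_get_ri) are
illustrative assumptions, not necessarily what the patches actually use:

/*
 * Sketch: instead of a per-CPU bpf_redirect_info, which a preempting
 * task could clobber on PREEMPT_RT (softirqs run preemptibly there),
 * the redirect state is reached via current and thus becomes per-task.
 */
struct bpf_net_context {
	struct bpf_redirect_info ri;
};

/* assumed member added to task_struct */
struct task_struct {
	/* ... */
	struct bpf_net_context *bpf_net_context;
	/* ... */
};

/* hypothetical accessor used on the XDP redirect path */
static inline struct bpf_redirect_info *bpf_net_ctx_get_ri(void)
{
	return &current->bpf_net_context->ri;
}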

Unpatched:
|Sending:
|Show adapter(s) (eno2np1) statistics (ONLY that changed!)
|Ethtool(eno2np1 ) stat: 952102520 ( 952,102,520) <= port.tx_bytes /sec
|Ethtool(eno2np1 ) stat: 14876602 ( 14,876,602) <= port.tx_size_64 /sec
|Ethtool(eno2np1 ) stat: 14876602 ( 14,876,602) <= port.tx_unicast /sec
|Ethtool(eno2np1 ) stat: 446045897 ( 446,045,897) <= tx-0.bytes /sec
|Ethtool(eno2np1 ) stat: 7434098 ( 7,434,098) <= tx-0.packets /sec
|Ethtool(eno2np1 ) stat: 446556042 ( 446,556,042) <= tx-1.bytes /sec
|Ethtool(eno2np1 ) stat: 7442601 ( 7,442,601) <= tx-1.packets /sec
|Ethtool(eno2np1 ) stat: 892592523 ( 892,592,523) <= tx_bytes /sec
|Ethtool(eno2np1 ) stat: 14876542 ( 14,876,542) <= tx_packets /sec
|Ethtool(eno2np1 ) stat: 2 ( 2) <= tx_restart /sec
|Ethtool(eno2np1 ) stat: 2 ( 2) <= tx_stopped /sec
|Ethtool(eno2np1 ) stat: 14876622 ( 14,876,622) <= tx_unicast /sec
|
|Receive:
|eth1->? 8,732,508 rx/s 0 err,drop/s
| receive total 8,732,508 pkt/s 0 drop/s 0 error/s
| cpu:10 8,732,508 pkt/s 0 drop/s 0 error/s
| enqueue to cpu 3 8,732,510 pkt/s 0 drop/s 7.00 bulk-avg
| cpu:10->3 8,732,510 pkt/s 0 drop/s 7.00 bulk-avg
| kthread total 8,732,506 pkt/s 0 drop/s 205,650 sched
| cpu:3 8,732,506 pkt/s 0 drop/s 205,650 sched
| xdp_stats 0 pass/s 8,732,506 drop/s 0 redir/s
| cpu:3 0 pass/s 8,732,506 drop/s 0 redir/s
| redirect_err 0 error/s
| xdp_exception 0 hit/s

I verified that the plain "drop only" case hits 14M packets/s, while
this CPU-redirect setup reports 8M packets/s.
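
(For reference: 64-byte line rate on 10G is 10 Gbit/s / ((64 + 20)
bytes * 8) ~= 14.88 Mpps, the extra 20 bytes being preamble +
inter-frame gap, so the ~14.88M port.tx_* counters above mean the
sender is generating line rate.)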

Patched:
|Sending:
|Show adapter(s) (eno2np1) statistics (ONLY that changed!)
|Ethtool(eno2np1 ) stat: 952635404 ( 952,635,404) <= port.tx_bytes /sec
|Ethtool(eno2np1 ) stat: 14884934 ( 14,884,934) <= port.tx_size_64 /sec
|Ethtool(eno2np1 ) stat: 14884928 ( 14,884,928) <= port.tx_unicast /sec
|Ethtool(eno2np1 ) stat: 446496117 ( 446,496,117) <= tx-0.bytes /sec
|Ethtool(eno2np1 ) stat: 7441602 ( 7,441,602) <= tx-0.packets /sec
|Ethtool(eno2np1 ) stat: 446603461 ( 446,603,461) <= tx-1.bytes /sec
|Ethtool(eno2np1 ) stat: 7443391 ( 7,443,391) <= tx-1.packets /sec
|Ethtool(eno2np1 ) stat: 893086506 ( 893,086,506) <= tx_bytes /sec
|Ethtool(eno2np1 ) stat: 14884775 ( 14,884,775) <= tx_packets /sec
|Ethtool(eno2np1 ) stat: 14 ( 14) <= tx_restart /sec
|Ethtool(eno2np1 ) stat: 14 ( 14) <= tx_stopped /sec
|Ethtool(eno2np1 ) stat: 14884937 ( 14,884,937) <= tx_unicast /sec
|
|Receive:
|eth1->? 8,735,198 rx/s 0 err,drop/s
| receive total 8,735,198 pkt/s 0 drop/s 0 error/s
| cpu:6 8,735,198 pkt/s 0 drop/s 0 error/s
| enqueue to cpu 3 8,735,193 pkt/s 0 drop/s 7.00 bulk-avg
| cpu:6->3 8,735,193 pkt/s 0 drop/s 7.00 bulk-avg
| kthread total 8,735,191 pkt/s 0 drop/s 208,054 sched
| cpu:3 8,735,191 pkt/s 0 drop/s 208,054 sched
| xdp_stats 0 pass/s 8,735,191 drop/s 0 redir/s
| cpu:3 0 pass/s 8,735,191 drop/s 0 redir/s
| redirect_err 0 error/s
| xdp_exception 0 hit/s

This looks to be in the same range / within the noise level. top-wise I
have ksoftirqd at 100% and cpumap/./map at ~60%, so I hit the CPU speed
limit on a 10G link. perf top shows
| 18.37% bpf_prog_4f0ffbb35139c187_cpumap_l4_hash [k] bpf_prog_4f0ffbb35139c187_cpumap_l4_hash
| 13.15% [kernel] [k] cpu_map_kthread_run
| 12.96% [kernel] [k] ixgbe_poll
| 6.78% [kernel] [k] page_frag_free
| 5.62% [kernel] [k] xdp_do_redirect

for the top 5. Does this look reasonable?

Sebastian