On 3/19/2025 1:44 PM, Nikhil Dhama wrote:
[...]
Also, do you run the network-related workloads on one machine? If so, please
try to run them on two machines instead, with clients and servers running on
different machines. At the least, please use different sockets for clients
and servers, because a larger pcp->free_count makes it easier to
trigger the free_high heuristic. If that is the case, please try to
optimize the free_high heuristic directly too.
I agree with Ying Huang, the above change is not the best possible fix for
the issue. On further analysis, I found that the root cause of the issue is
frequent pcp high-order flushes. During a 20-second iperf3 run,
I observed on average 5 pcp high-order flushes in kernel v6.6, whereas in
v6.7 I observed about 170.

Tracing pcp->free_count, I found that with patch v1 (the patch I suggested
earlier) free_count goes negative, which reduces the number of
times the free_high heuristic is triggered and hence reduces the high-order
flushes.

As Ying Huang suggested, increasing the batch size used by the free_high
heuristic helps performance. I tried different scaling factors to find the
most suitable batch value for the free_high heuristic:

                    score   # free_high
  ---------------   -----   -----------
  v6.6 (base)         100             4
  v6.12 (batch*1)      69           170
  batch*2              69           150
  batch*4              74           101
  batch*5             100            53
  batch*6             100            36
  batch*8             100             3

Scaling batch for the free_high heuristic by a factor of 5 restores the
performance.
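
To make the effect of the scaling factor concrete, here is a minimal
user-space sketch of a scaled threshold check. It is only a model, not the
kernel code: struct pcp_model, model_free_high() and the scale parameter are
hypothetical names, and the real free_high condition in mm/page_alloc.c has
additional terms that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-CPU pageset state. */
struct pcp_model {
	int free_count;       /* pages freed recently (models pcp->free_count) */
	bool prev_high_order; /* was the previous free a high-order free? */
};

/*
 * Model of the heuristic: report "free_high" only when the previous free
 * was high-order and free_count has reached batch * scale.
 * scale == 1 models the unscaled threshold; a larger scale raises the
 * threshold, so the check fires less often for the same free_count.
 */
bool model_free_high(const struct pcp_model *pcp, int batch, int scale)
{
	assert(batch > 0 && scale > 0);
	return pcp->prev_high_order && pcp->free_count >= batch * scale;
}
```

For example, with batch = 63 and free_count = 200 the model reports
free_high for scale = 1 (200 >= 63) but not for scale = 5 (200 < 315).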
Hello Nikhil,
Thanks for looking further into this. But from a design standpoint,
it is not clear how scaling the batch size by a factor of 5 helps here
(Andrew's original question).
In any case, can you post the patch set in a new email so that the patch
below is not lost in the discussion thread?