Re: [RFC] Introduce batch variants of accept() and epoll_ctl() syscalls

From: Li Yu
Date: Fri Jul 06 2012 - 05:38:39 EST

On 2012-06-15 16:51, Eric Dumazet wrote:
On Fri, 2012-06-15 at 13:37 +0800, Li Yu wrote:

Of course, I think that implementing them should not be hard work :)

Hmm. I really do not know whether it is necessary to introduce a new
syscall here. An alternative solution is to add a new socket option to
handle this batch requirement, so applications can also detect whether
the kernel has this extended ability with an easy getsockopt() call.
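
A rough sketch of that detection idea, assuming the kernel rejects an unknown SOL_SOCKET option with ENOPROTOOPT (as Linux does). SO_BATCH_ACCEPT_HYPOTHETICAL and its value are made up purely for illustration; no such option exists, and sockopt_supported() is only a helper name chosen here:

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical option number, for illustration only. It is not defined
 * by any kernel; a real patch would pick a proper SO_* value. */
#define SO_BATCH_ACCEPT_HYPOTHETICAL 0x7fff

/* Returns 1 if the kernel recognises (level, optname) on fd,
 * 0 if it answers ENOPROTOOPT (option unknown),
 * and -1 on any other error. */
int sockopt_supported(int fd, int level, int optname)
{
    int val = 0;
    socklen_t len = sizeof(val);

    assert(fd >= 0);
    if (getsockopt(fd, level, optname, &val, &len) == 0)
        return 1;
    return errno == ENOPROTOOPT ? 0 : -1;
}
```

An application would call sockopt_supported(fd, SOL_SOCKET, SO_BATCH_ACCEPT_HYPOTHETICAL) once at startup and fall back to the plain accept() path when the result is not 1.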

Anyway, I am going to try to write a prototype first.

Before that, could you post the result of "perf top", or "perf
record ...;perf report"?

Sorry, I only just found time to write a benchmark to reproduce this
problem on my test bed. Below are the results of "perf record -g -C 0".
The kernel is 3.4.0:

Events: 7K cycles
+  54.87%  swapper  [kernel.kallsyms]  [k] poll_idle
-   3.10%   :22984  [kernel.kallsyms]  [k] _raw_spin_lock
   - _raw_spin_lock
      - 64.62% sch_direct_xmit
         - ip_local_out
            + 49.48% ip_queue_xmit
            + 37.48% ip_build_and_send_pkt
            + 13.04% ip_send_skb

I cannot reproduce exactly the same high CPU usage in my test
environment, but top shows a similar ratio of sys% to si% on one CPU:

Tasks: 125 total, 2 running, 123 sleeping, 0 stopped, 0 zombie
Cpu0 : 1.0%us, 30.7%sy, 0.0%ni, 18.8%id, 0.0%wa, 0.0%hi, 49.5%si, 0.0%st

Well, it seems that I must acknowledge I was wrong here. However,
I recall that I did encounter this before, in another benchmark of
small-packet performance.

I guess this is because the TX softirq and the syscall context contend
for the same lock in sch_direct_xmit(). Is this right?



top shows that the kernel is the biggest CPU hog. The test is simple,
just an accept() -> epoll_ctl(ADD) loop, and the ratio of CPU
utilization sys% to si% is about 2:5.
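
For reference, a minimal sketch of such an accept() -> epoll_ctl(ADD) loop. This is not the actual benchmark, which was not posted. The names accept_and_register() and accept_loop_demo() are made up here, and the demo uses an abstract Unix-domain listener only so that it runs without any network setup; the loop itself is the same for a TCP listener:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Drain the accept queue of a non-blocking listening socket and register
 * every new connection with epfd: two syscalls per connection.
 * Returns the number of connections registered, or -1 on error. */
int accept_and_register(int listen_fd, int epfd)
{
    int count = 0;

    for (;;) {
        struct epoll_event ev;
        int fd = accept(listen_fd, NULL, NULL);

        if (fd < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return count;               /* accept queue is empty */
            if (errno == EINTR || errno == ECONNABORTED)
                continue;                   /* transient, try again */
            return -1;
        }
        memset(&ev, 0, sizeof(ev));
        ev.events = EPOLLIN;
        ev.data.fd = fd;
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
            close(fd);
            return -1;
        }
        count++;
    }
}

/* Self-check: queue n clients on a listener, then run the loop once.
 * Returns the number of connections registered, or -1 on setup failure.
 * Client and accepted fds are left open; this is a throwaway demo. */
int accept_loop_demo(int n)
{
    struct sockaddr_un addr;
    socklen_t len;
    int lfd, epfd, i, got;

    assert(n > 0);
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    /* sun_path[0] stays '\0': Linux abstract namespace, no file created. */
    snprintf(addr.sun_path + 1, sizeof(addr.sun_path) - 1,
             "accept-loop-demo-%ld", (long)getpid());
    len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 +
                      strlen(addr.sun_path + 1));

    lfd = socket(AF_UNIX, SOCK_STREAM, 0);
    epfd = epoll_create1(0);
    if (lfd < 0 || epfd < 0)
        return -1;
    if (bind(lfd, (struct sockaddr *)&addr, len) < 0 ||
        listen(lfd, n) < 0 ||
        fcntl(lfd, F_SETFL, O_NONBLOCK) < 0)
        return -1;

    for (i = 0; i < n; i++) {
        int cfd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (cfd < 0 || connect(cfd, (struct sockaddr *)&addr, len) < 0)
            return -1;
    }
    got = accept_and_register(lfd, epfd);
    close(lfd);
    close(epfd);
    return got;
}
```

A batch variant would aim to replace the per-connection pair of syscalls in accept_and_register() with fewer kernel entries. Whether that helps depends on where the time is actually spent.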

This ratio is not meaningful if we don't know where the time is spent.

I doubt epoll_ctl(ADD) is a problem here...

If it is, batching the fds won't speed things up anyway...

I believe accept() is the problem here, because it contends with the
softirq that processes the TCP session handshake.
