Re: RFC: revert request for cpuidle patches e11538d1 and 69a37bea

From: Jeremy Eder
Date: Mon Jul 29 2013 - 13:01:08 EST

On 130729 23:57:31, Youquan Song wrote:
> Hi Jeremy,
> I try reproduce your result and then fix the issue, but I do not reproduce it
> yet.
> I run at netperf-2.6.0 at one machine as server: netserver, other
> machine: netperf -t TCP_RR -H $SERVER_IP -l 60. The target machine is
> used in both client and server. I do not reproduce the performance drop
> issue. I also notice the result is not stable, sometime it is high,
> sometime is low. In sumarry, it is hard to make a definite result.
> Can you try tell me how to reproduce the issue? how do you get the C0
> data?
> What's your config for kernel? Do you enable CONFIG_NO_HZ_FULL=y or
> only CONFIG_NO_HZ=y?
> Thanks
> -Youquan


To answer both your and Daniel's question, those results used only

These network latency benchmarks are fickle creatures, and need careful
tuning to become reproducible. Plus there are BIOS implications and tuning
varies by vendor.

Anyway, for the most part it's probably not stable because, to get any
sort of reproducibility between runs, you need to do at least these steps:

- ensure as little is running in userspace as possible
- determine PCI affinity for the NIC
- on both machines, isolate the socket connected to the NIC from userspace
- turn off irqbalance and bind all IRQs for that NIC to a single core on
the same socket as the NIC
- run netperf with -TX,Y where X,Y are core numbers that you wish
netperf/netserver to run on, respectively.
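
For the "determine PCI affinity" step, a minimal sketch, assuming the NIC
is a PCI device that exposes the usual numa_node and local_cpulist files
in sysfs. The function name nic_locality and the SYSFS_ROOT override are
hypothetical, the latter only there so the helper can be tried against a
fake tree:

```shell
#!/bin/sh
# Hypothetical helper: print the NUMA node and the CPUs local to a NIC.
# SYSFS_ROOT is an illustrative override; it defaults to the real /sys.
nic_locality() {
    iface=$1
    dev="${SYSFS_ROOT:-/sys}/class/net/$iface/device"
    # numa_node may read -1 on machines with a single node.
    printf 'numa_node=%s local_cpulist=%s\n' \
        "$(cat "$dev/numa_node")" "$(cat "$dev/local_cpulist")"
}
```

Calling "nic_locality eth0" (interface name assumed) then tells you which
cores are candidates for the IRQ binding and for -TX,Y.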

For example, if your NIC is attached to socket 0 and socket 0 cores are
enumerated 0-7, then:

- set /proc/irq/NNN/smp_affinity_list to, say, 6 for all vectors on that
NIC
- nice -20 netperf -t TCP_RR -H $SERVER_IP -l 60 -T4,4 -s 2
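
The IRQ binding can be scripted roughly as follows. This is a sketch under
assumptions: it assumes the NIC's vectors show up in /proc/interrupts with
the interface name in their description (some drivers name them
differently), and bind_nic_irqs and PROC_ROOT are hypothetical names, the
latter only for trying it against a fake tree:

```shell
#!/bin/sh
# Hypothetical helper: pin every IRQ whose /proc/interrupts line mentions
# the given interface to a single core.
bind_nic_irqs() {
    iface=$1
    core=$2
    proc=${PROC_ROOT:-/proc}
    # Rows look like " 64:  123  456  PCI-MSI-edge  eth0-TxRx-0";
    # the IRQ number is the first field, minus its trailing colon.
    awk -v pat="$iface" '$0 ~ pat { sub(":", "", $1); print $1 }' \
        "$proc/interrupts" |
    while read -r irq; do
        echo "$core" > "$proc/irq/$irq/smp_affinity_list"
    done
}
```

Run it as root, e.g. "bind_nic_irqs eth0 6", after irqbalance has been
stopped, otherwise the affinities may get rewritten.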

That should get you most of the way there. The -s 2 connects and waits 2
seconds; I found this helps with the first few seconds' worth of data.
Alternatively, you could just toss the first 2 seconds' worth, since it
seems to take that long to stabilize. What I mean is, if you're not using
the -D1,1 option to netperf, you might not have seen that netperf tests
seem to take a few seconds to stabilize even when properly tuned.
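
Tossing the warm-up samples can be done with a small filter. A minimal
sketch, assuming the interim lines that netperf prints with -D start with
"Interim result:" (check this against your netperf build); skip_warmup is
a hypothetical name:

```shell
#!/bin/sh
# Hypothetical filter: drop the first N interim samples from netperf
# output on stdin and pass everything else through unchanged.
skip_warmup() {
    awk -v n="$1" '
        /^Interim result:/ { if (++seen <= n) next }
        { print }
    '
}
```

For example: netperf -t TCP_RR -H $SERVER_IP -l 60 -T4,4 -D1,1 |
skip_warmup 2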

I got the C0 data by running turbostat in parallel with each benchmark run,
then grabbing the C-state data for the cores relevant to the test. In my
case that was cores 4 and 6, where core 4 was where I put netperf/netserver
and core 6 was where I put the NIC IRQs. Then I parsed that output into a
format that this could interpret:
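
Separately, one way to pull per-core residency out of turbostat output is
sketched here purely for illustration; it is not the script used for the
results in this thread. It assumes a whitespace-separated header row that
contains a "CPU" column and the residency column you ask for (column
names differ between turbostat versions), and extract_col is a
hypothetical name:

```shell
#!/bin/sh
# Hypothetical helper: print "CPU value" for selected CPUs.
# $1: header of the column to extract (e.g. %c0)
# $2: comma-separated CPU numbers (e.g. 4,6)
extract_col() {
    awk -v col="$1" -v cpus="$2" '
        BEGIN {
            n = split(cpus, a, ",")
            for (i = 1; i <= n; i++) want[a[i]] = 1
        }
        # Until the header is found, look for the CPU and target columns.
        !c {
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU") p = i
                if ($i == col) c = i
            }
            next
        }
        # Data rows: keep only the CPUs that were asked for.
        ($p in want) { print $p, $c }
    '
}
```

With the layout above, "extract_col %c0 4,6 < turbostat.log" (file name
assumed) would give one line per sample for cores 4 and 6.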

I'm building a kernel from Rafael's tree and will try to confirm what Len
already sent. Thanks everyone for looking into it.