Re: [PATCH 1/5] Free up pf flag PF_KSOFTIRQD -v2

From: Peter Zijlstra
Date: Wed Dec 22 2010 - 04:17:42 EST


On Tue, 2010-12-21 at 17:09 -0800, Venkatesh Pallipadi wrote:
> Patchset:
> This is Part 2 of
> "Proper kernel irq time accounting -v4"
> http://lkml.indiana.edu/hypermail//linux/kernel/1010.0/01175.html
>
> and applies to 2.6.37-rc7.
>
> Part 1 fixes the way irqs are accounted in the scheduler and tasks. This
> patchset fixes how irq times are reported in /proc/stat and also stops
> including irq time in task->stime, etc.
>
> Example:
> Running a cpu intensive loop and network intensive nc on a 4 CPU system
> and looking at 'top' output.
>
> With vanilla kernel:
> Cpu0 : 0.0% us, 0.3% sy, 0.0% ni, 99.3% id, 0.0% wa, 0.0% hi, 0.3% si
> Cpu1 : 100.0% us, 0.0% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
> Cpu2 : 1.3% us, 27.2% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 71.4% si
> Cpu3 : 1.6% us, 1.3% sy, 0.0% ni, 96.7% id, 0.0% wa, 0.0% hi, 0.3% si
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 7555 root 20 0 1760 528 436 R 100 0.0 0:15.79 nc
> 7563 root 20 0 3632 268 204 R 100 0.0 0:13.13 loop
>
> Notes:
> * Both tasks show 100% CPU, even when one of them is stuck on a CPU that's
> processing 70% softirq.
> * no hardirq time.
>
>
> With "Part 1" patches:
> Cpu0 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
> Cpu1 : 100.0% us, 0.0% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
> Cpu2 : 2.0% us, 30.6% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 67.4% si
> Cpu3 : 0.7% us, 0.7% sy, 0.3% ni, 98.3% id, 0.0% wa, 0.0% hi, 0.0% si
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 6289 root 20 0 3632 268 204 R 100 0.0 2:18.67 loop
> 5737 root 20 0 1760 528 436 R 33 0.0 0:26.72 nc
>
> Notes:
> * Tasks show 100% CPU and 33% CPU that correspond to their non-irq exec time.
> * no hardirq time.
>
>
> With "Part 1 + Part 2" patches:
> Cpu0 : 1.3% us, 1.0% sy, 0.3% ni, 97.0% id, 0.0% wa, 0.0% hi, 0.3% si
> Cpu1 : 99.3% us, 0.0% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.7% hi, 0.0% si
> Cpu2 : 1.3% us, 31.5% sy, 0.0% ni, 0.0% id, 0.0% wa, 8.3% hi, 58.9% si
> Cpu3 : 1.0% us, 2.0% sy, 0.3% ni, 95.0% id, 0.0% wa, 0.7% hi, 1.0% si
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 20929 root 20 0 3632 268 204 R 99 0.0 3:48.25 loop
> 20796 root 20 0 1760 528 436 R 33 0.0 2:38.65 nc
>
> Notes:
> * Both task exec time and hard irq time reported correctly.
> * hi and si time are based on fine granularity info and not on samples.
> * getrusage would give proper utime/stime split not including irq times
> in that ratio.
> * Other places that report user/sys time, like cgroup cpuacct.stat, will
> now include only non-irq exec time.
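For anyone wanting to cross-check the hi/si columns above without top, a
minimal userspace sketch follows. It assumes the usual per-CPU line layout of
/proc/stat (user nice system idle iowait irq softirq ...) and ignores steal
and guest time; the names cpu_sample, parse_cpu_line and field_percent are
made up for illustration and are not part of the patchset.

```c
#include <assert.h>
#include <stdio.h>

/* Order of the first seven counters on a "cpuN" line in /proc/stat. */
enum { F_USER, F_NICE, F_SYSTEM, F_IDLE, F_IOWAIT, F_IRQ, F_SOFTIRQ, F_COUNT };

/* Hypothetical container for one snapshot of a CPU's counters. */
struct cpu_sample {
	unsigned long long t[F_COUNT];
};

/* Parse one "cpuN ..." line. Returns 0 on success, -1 on a short line. */
static int parse_cpu_line(const char *line, struct cpu_sample *s)
{
	int n = sscanf(line, "%*s %llu %llu %llu %llu %llu %llu %llu",
		       &s->t[F_USER], &s->t[F_NICE], &s->t[F_SYSTEM],
		       &s->t[F_IDLE], &s->t[F_IOWAIT], &s->t[F_IRQ],
		       &s->t[F_SOFTIRQ]);

	return n == F_COUNT ? 0 : -1;
}

/*
 * Share of one field in the interval between two snapshots, in percent.
 * Assumes the counters only grow between prev and cur; steal and guest
 * time are left out of the total, which is a simplification.
 */
static double field_percent(const struct cpu_sample *prev,
			    const struct cpu_sample *cur, int field)
{
	unsigned long long total = 0;
	int i;

	assert(field >= 0 && field < F_COUNT);

	for (i = 0; i < F_COUNT; i++)
		total += cur->t[i] - prev->t[i];
	if (total == 0)
		return 0.0;

	return 100.0 * (double)(cur->t[field] - prev->t[field]) / (double)total;
}
```

Taking two snapshots of the same cpuN line a second or so apart and calling
field_percent() with F_IRQ and F_SOFTIRQ should give numbers comparable to the
hi and si columns.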
>
> This patch:

Your 0/x cover letter seems to be repeated in here for some reason... I would
expect only the little bit below.

> Cleanup patch, freeing up PF_KSOFTIRQD and using the per_cpu ksoftirqd
> pointer instead, as suggested by Eric Dumazet.
>
> Tested-by: Shaun Ruffell <sruffell@xxxxxxxxxx>
> Signed-off-by: Venkatesh Pallipadi <venki@xxxxxxxxxx>
> ---
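
Since the diff itself is not quoted here, a rough userspace sketch of the idea
in the changelog may help: instead of testing a per-task flag bit, compare the
task against a per-CPU pointer to that CPU's ksoftirqd thread. Everything
below (struct task, the fixed NR_CPUS, set_ksoftirqd, is_ksoftirqd) is a
hypothetical stand-in and not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* assumption: fixed CPU count for the sketch */

/* Stand-in for task_struct; only the name is kept. */
struct task {
	const char *comm;
};

/* One pointer per CPU, standing in for a per_cpu ksoftirqd variable. */
static struct task *ksoftirqd[NR_CPUS];

/* Record which task is the softirq thread of the given CPU. */
static void set_ksoftirqd(int cpu, struct task *t)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	ksoftirqd[cpu] = t;
}

/*
 * True if t is the softirq thread of this CPU. This is a plain pointer
 * comparison, so no bit in the task's flags word is needed.
 */
static bool is_ksoftirqd(int cpu, const struct task *t)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return t != NULL && ksoftirqd[cpu] == t;
}
```

The pointer comparison carries the same information as the flag for the task
running on that CPU, which is what lets the flag bit be freed for other uses.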