Re: Linux 2.6.32-rc1

From: Eric Dumazet
Date: Mon Sep 28 2009 - 14:42:23 EST


Martin Schwidefsky a Ãcrit :
> On Mon, 28 Sep 2009 08:39:31 -0700 (PDT)
> Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>> On Mon, 28 Sep 2009, Eric Dumazet wrote:
>>> Linus Torvalds a Ãcrit :
>>>> Go wild, test it out, and let us know about any regressions you find,
>>> Something seems wrong with process time accounting.
>>>
>>> Following program should consume 10*8 seconds of cpu on a 8 cpu machine, but
>>> with 2.6.32-rc1 numbers are crazy.
>> Ok, so the top-level process time looks sane _while_ the thing is running
>> (sum of all threads), but the per-thread times look broken (as if they had
>> randomly had the times of the other threads added into them - looks to me
>> like they on average had half the other threads' time accounted into
>> them).
>>
>> And then at the end, it looks like the time of the threads (which was
>> over-accounted) get re-accounted back into the main process, so the final
>> time is indeed wildly inflated.
>>
>> IOW, looks like we're adding thread times multiple times to other threads
>> (and then finally to the parent).
>>
>> I'm adding the usual suspects to the cc, and leaving your results and
>> test-program quoted here for them.. Thomas, Martin, John - if you're not
>> the people I should blame for this, let me know.
>
> Uh-oh.. usual suspects, eh?
>
> For starters I run the test program on an s390 with
> VIRT_CPU_ACCOUNTING=y:
>
> time ./burn-cpu
> PID TTY STAT TIME COMMAND
> 2979 pts/0 - 0:08 ./burn-cpu
> - - Sl+ 0:00 -
> - - Rl+ 0:01 -
> - - Rl+ 0:01 -
> - - Rl+ 0:01 -
> - - Rl+ 0:01 -
> - - Rl+ 0:01 -
> - - Rl+ 0:01 -
> - - Rl+ 0:01 -
> - - Rl+ 0:01 -
> PID TTY STAT TIME COMMAND
> 2979 pts/0 - 0:16 ./burn-cpu
> - - Sl+ 0:00 -
> - - Rl+ 0:02 -
> - - Rl+ 0:02 -
> - - Rl+ 0:02 -
> - - Rl+ 0:02 -
> - - Rl+ 0:02 -
> - - Rl+ 0:02 -
> - - Rl+ 0:02 -
> - - Rl+ 0:02 -
> PID TTY STAT TIME COMMAND
> 2979 pts/0 - 0:25 ./burn-cpu
> - - Sl+ 0:00 -
> - - Rl+ 0:03 -
> - - Rl+ 0:03 -
> - - Rl+ 0:03 -
> - - Rl+ 0:03 -
> - - Rl+ 0:03 -
> - - Rl+ 0:03 -
> - - Rl+ 0:03 -
> - - Rl+ 0:03 -
> PID TTY STAT TIME COMMAND
> 2979 pts/0 - 0:33 ./burn-cpu
> - - Sl+ 0:00 -
> - - Rl+ 0:04 -
> - - Rl+ 0:04 -
> - - Rl+ 0:04 -
> - - Rl+ 0:04 -
> - - Rl+ 0:04 -
> - - Rl+ 0:04 -
> - - Rl+ 0:04 -
> - - Rl+ 0:04 -
> PID TTY STAT TIME COMMAND
> 2979 pts/0 - 0:41 ./burn-cpu
> - - Sl+ 0:00 -
> - - Rl+ 0:05 -
> - - Rl+ 0:05 -
> - - Rl+ 0:05 -
> - - Rl+ 0:05 -
> - - Rl+ 0:05 -
> - - Rl+ 0:05 -
> - - Rl+ 0:05 -
> - - Rl+ 0:05 -
> PID TTY STAT TIME COMMAND
> 2979 pts/0 - 0:50 ./burn-cpu
> - - Sl+ 0:00 -
> - - Rl+ 0:06 -
> - - Rl+ 0:06 -
> - - Rl+ 0:06 -
> - - Rl+ 0:06 -
> - - Rl+ 0:06 -
> - - Rl+ 0:06 -
> - - Rl+ 0:06 -
> - - Rl+ 0:06 -
> PID TTY STAT TIME COMMAND
> 2979 pts/0 - 0:58 ./burn-cpu
> - - Sl+ 0:00 -
> - - Rl+ 0:07 -
> - - Rl+ 0:07 -
> - - Rl+ 0:07 -
> - - Rl+ 0:07 -
> - - Rl+ 0:07 -
> - - Rl+ 0:07 -
> - - Rl+ 0:07 -
> - - Rl+ 0:07 -
> PID TTY STAT TIME COMMAND
> 2979 pts/0 - 1:07 ./burn-cpu
> - - Sl+ 0:00 -
> - - Rl+ 0:08 -
> - - Rl+ 0:08 -
> - - Rl+ 0:08 -
> - - Rl+ 0:08 -
> - - Rl+ 0:08 -
> - - Rl+ 0:08 -
> - - Rl+ 0:08 -
> - - Rl+ 0:08 -
> PID TTY STAT TIME COMMAND
> 2979 pts/0 - 1:15 ./burn-cpu
> - - Sl+ 0:00 -
> - - Rl+ 0:09 -
> - - Rl+ 0:09 -
> - - Rl+ 0:09 -
> - - Rl+ 0:09 -
> - - Rl+ 0:09 -
> - - Rl+ 0:09 -
> - - Rl+ 0:09 -
> - - Rl+ 0:09 -
> PID TTY STAT TIME COMMAND
> 2979 pts/0 - 1:25 ./burn-cpu
> - - S+ 1:25 -
>
> real 0m10.708s
> user 1m25.051s
> sys 0m0.174s
>
> looks better, gives an average of 10.63 seconds per thread and the per
> thread numbers are fine. I'll see what happens with the test case on my
> pc@home.
>


I did a bisection and found commit def0a9b2573e00ab0b486cb5382625203ab4c4a6
was the origin of the problem on my x86_32 machine.

def0a9b2573e00ab0b486cb5382625203ab4c4a6 is first bad commit
commit def0a9b2573e00ab0b486cb5382625203ab4c4a6
Author: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Date: Fri Sep 18 20:14:01 2009 +0200

sched_clock: Make it NMI safe

Arjan complained about the suckyness of TSC on modern machines, and
asked if we could do something about that for PERF_SAMPLE_TIME.

Make cpu_clock() NMI safe by removing the spinlock and using
cmpxchg. This also makes it smaller and more robust.

Affects architectures that use HAVE_UNSTABLE_SCHED_CLOCK, i.e. IA64
and x86.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>

git bisect start
# bad: [17d857be649a21ca90008c6dc425d849fa83db5c] Linux 2.6.32-rc1
git bisect bad 17d857be649a21ca90008c6dc425d849fa83db5c
# good: [74fca6a42863ffacaf7ba6f1936a9f228950f657] Linux 2.6.31
git bisect good 74fca6a42863ffacaf7ba6f1936a9f228950f657
# good: [74fca6a42863ffacaf7ba6f1936a9f228950f657] Linux 2.6.31
git bisect good 74fca6a42863ffacaf7ba6f1936a9f228950f657
# good: [5d1fe0c98f2aef99fb57aaf6dd25e793c186cea3] Staging: vt6656: Integrate vt6656 into build system.
git bisect good 5d1fe0c98f2aef99fb57aaf6dd25e793c186cea3
# good: [5d1fe0c98f2aef99fb57aaf6dd25e793c186cea3] Staging: vt6656: Integrate vt6656 into build system.
git bisect good 5d1fe0c98f2aef99fb57aaf6dd25e793c186cea3
# bad: [c720f5655df159a630fa0290a0bd67c93e92b0bf] Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
git bisect bad c720f5655df159a630fa0290a0bd67c93e92b0bf
# good: [a03fdb7612874834d6847107198712d18b5242c7] Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect good a03fdb7612874834d6847107198712d18b5242c7
# good: [3530c1886291df061e3972c55590777ef1cb67f8] Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
git bisect good 3530c1886291df061e3972c55590777ef1cb67f8
# good: [84d6ae431f315e8973aac3c3fe1d550fc9240ef3] V4L/DVB (13033): pt1: Don't use a deprecated DMA_BIT_MASK macro
git bisect good 84d6ae431f315e8973aac3c3fe1d550fc9240ef3
# bad: [ebc79c4f8da0f92efa968e0328f32334a2ce80cf] Merge git://git.kernel.org/pub/scm/linux/kernel/git/jaswinder/linux-2.6
git bisect bad ebc79c4f8da0f92efa968e0328f32334a2ce80cf
# good: [7bd032dc2793afcbaf4a350056768da84cdbd89b] USB serial: update the console driver
git bisect good 7bd032dc2793afcbaf4a350056768da84cdbd89b
# good: [1281a49b7aa865a1f0d60e2b28410e6234fc686b] perf trace: Sample timestamp and cpu when using record flag
git bisect good 1281a49b7aa865a1f0d60e2b28410e6234fc686b
# bad: [e11c675ede0d42a405ae595528bf0b29ce1ae56f] Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
git bisect bad e11c675ede0d42a405ae595528bf0b29ce1ae56f
# bad: [6161352142d5fed4cd753b32e5ccde66e705b14e] tracing, perf: Convert the power tracer into an event tracer
git bisect bad 6161352142d5fed4cd753b32e5ccde66e705b14e
# bad: [929bf0d0156562ce631728b6fa53d68004d456d2] Merge branch 'linus' into perfcounters/core
git bisect bad 929bf0d0156562ce631728b6fa53d68004d456d2
# good: [5622f295b53fb60dbf9bed3e2c89d182490a8b7f] x86, perf_counter, bts: Optimize BTS overflow handling
git bisect good 5622f295b53fb60dbf9bed3e2c89d182490a8b7f
# bad: [def0a9b2573e00ab0b486cb5382625203ab4c4a6] sched_clock: Make it NMI safe
git bisect bad def0a9b2573e00ab0b486cb5382625203ab4c4a6


My .config file is attached to this mail

Attachment: .config
Description: application/xml