Re: AIM7 40% regression with 2.6.26-rc1

From: Ingo Molnar
Date: Tue May 06 2008 - 07:45:43 EST



* Zhang, Yanmin <yanmin_zhang@xxxxxxxxxxxxxxx> wrote:

> Comparing with kernel 2.6.25, AIM7 (use tmpfs) has more than 40% with
> 2.6.26-rc1 on my 8-core stoakley, 16-core tigerton, and Itanium
> Montecito. Bisect located below patch.
>
> 64ac24e738823161693bf791f87adc802cf529ff is first bad commit
> commit 64ac24e738823161693bf791f87adc802cf529ff
> Author: Matthew Wilcox <matthew@xxxxxx>
> Date: Fri Mar 7 21:55:58 2008 -0500
>
> Generic semaphore implementation
>
> After I manually reverted the patch against 2.6.26-rc1 while fixing
> lots of conflicts/errors, aim7 regression became less than 2%.

hm, which exact semaphore would that be due to?

My first blind guess would be the BKL - there's not much other semaphore
use left in the core kernel otherwise that would affect AIM7 normally.
The VFS still makes frequent use of the BKL and AIM7 is very VFS
intense. Getting rid of that BKL use from the VFS might be useful to
performance anyway.

Could you try to check that it's indeed the BKL?

Easiest way to check it would be to run AIM7 on
sched-devel.git/latest and do scheduler tracing via:

http://people.redhat.com/mingo/sched-devel.git/readme-tracer.txt

by doing:

echo stacktrace > /debug/tracing/iter_ctl

you could get exact backtraces of all scheduling points in the trace. If
the BKL's down() shows up in those traces then it's definitely the BKL
that causes this. The backtraces will also tell us exactly which BKL use
is the most frequent one.
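
As a rough way to check a captured trace for this, the sketch below counts
the lines that mention the BKL entry points. It is only an illustration:
the helper name count_bkl_hits and the idea of saving the trace to a text
file first are assumptions, as is the exact symbol spelling in the
backtraces (lock_kernel and down are matched as whole words, so down_read
and similar are not counted).

```shell
#!/bin/sh
# Hypothetical helper: count trace lines that mention lock_kernel or down.
# $1 is a text file holding a saved copy of the trace output (assumed).
# -w (supported by GNU and BSD grep) matches whole words only, so
# symbols such as down_read or shutdown are not counted.
count_bkl_hits() {
    grep -c -w -E 'lock_kernel|down' "$1"
}
```

A non-zero count only says that those symbols appear in the saved trace;
reading the backtraces themselves is still needed to see which caller
dominates.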

To keep tracing overhead low on SMP I'd also suggest tracing only a
single CPU, via:

echo 1 > /debug/tracing/tracing_cpumask
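
The two settings above can be wrapped into one small helper, sketched
below. The function name setup_sched_trace and its optional arguments are
made up for illustration; the paths and values are the ones from this
mail. It assumes the tracer control files live under /debug/tracing and
that tracing_cpumask takes a CPU bitmask, so that 1 selects the first CPU.

```shell
#!/bin/sh
# Hypothetical helper: apply the two tracer settings from this mail.
# $1: tracing directory (defaults to /debug/tracing, as used above)
# $2: CPU mask to trace (defaults to 1, assumed to mean the first CPU)
setup_sched_trace() {
    trace_dir="${1:-/debug/tracing}"
    cpumask="${2:-1}"

    if [ ! -d "$trace_dir" ]; then
        echo "no tracing directory at $trace_dir" >&2
        return 1
    fi

    # Record a backtrace at every scheduling point in the trace.
    echo stacktrace > "$trace_dir/iter_ctl" || return 1

    # Restrict tracing to the selected CPU(s) to keep overhead low.
    echo "$cpumask" > "$trace_dir/tracing_cpumask" || return 1
}
```

Calling setup_sched_trace with no arguments, as root, before starting AIM7
would apply both settings in one step.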

Ingo