Re: rcu: Throttle rcu_try_advance_all_cbs() execution causesvisible slowdown in ftrace switching
From: Petr Mladek
Date: Tue Nov 05 2013 - 10:43:15 EST
Hello Paul,
Paul E. McKenney pÃÅe v Po 04. 11. 2013 v 09:02 -0800:
> On Fri, Nov 01, 2013 at 06:19:44PM +0100, Petr Mladek wrote:
> > Hi,
> >
> > I am doing some clean up in x86 ftrace code. I check the performance by
> > switching between different tracers and by enabling and disabling them.
> >
> > The operation has started to be much slower after rebasing on the
> > kernel tip tree. Bisecting has shown that the difference was caused by
> > the commit c229828ca6bc62d6c654 (rcu: Throttle
> > rcu_try_advance_all_cbs() execution)
> This is a slowpath, and that commit did fix a real bug, so I am OK with
> this modest slowdown.
>
> That said, if you have a workload where this is a problem, please try
> building with CONFIG_RCU_FAST_NO_HZ=n. The fact that this commit had any
> effect at all leads me to believe that you used CONFIG_RCU_FAST_NO_HZ=y.
Yes, I used CONFIG_RCU_FAST_NO_HZ=y.
I am not aware of any other workload with this problem. I tried few
benchmarks: dbench, unixbench, and aim. They did not show any
considerable difference with and without the commit. If you are
interested you might find more details in the attached logs.
Just for record, I also checked how the ftrace test was affected by the
commit under various system load. The speed difference was there if at
least one CPU was idle. But the test was slower on idle system even
without the patch. Hence this is not the only change that causes some
difference. See the attached "ftrace" file for more details.
I am still a bit curious why ftrace code is so special here and why it
does not affect the other benchmarks. Anyway, I agree that ftrace
change/start/stop operations are not time critical and the extra delay
might be worth fixing the other bug. I am fine with it :-)
Best Regards,
Petr
Attachment:
aim9
Description: application/gmc-link
Attachment:
dbench
Description: application/gmc-link
test: ftrace
CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
RAM: 4GB
Started using the command:
time ./test-ftrace
Where test-ftrace is:
--- cut ---
#!/bin/bash
echo "Testing ftrace - begin"
cd /sys/kernel/debug/tracing
count=0
while test $count -lt 100 ; do
for tracer in function nop ; do
echo $tracer >current_tracer
echo 1 >tracing_on
echo 0 >tracing_on
done
count=$(($count + 1))
done
cd -
echo "Testing ftrace - end"
--- cut ---
and the load was generated using the script make-load:
--- cut ---
#!/bin/bash
a=1
while true ; do
a=$(($a+1))
done
--- cut ---
Now, results with no load on the system; with the commit c229828ca6bc62d
(rcu: Throttle rcu_try_advance_all_cbs() execution)
real 0m49.393s 0m49.632s 0m49.359
user 0m0.004s 0m0.000s 0m0.004s
sys 0m0.996s 0m0.880s 0m0.892s
and with reverted c229828ca6bc62d (rcu: Throttle rcu_try_advance_all_cbs()
execution)
real 0m35.320s 0m35.687s 0m35.920s
user 0m0.004s 0m0.004s 0m0.000s
sys 0m1.140s 0m1.208s 0m1.152s
---
Results with 100% load on one CPU and no load on 2nd CPU (started the script
make-load once) with c229828ca6bc62d (rcu: Throttle rcu_try_advance_all_cbs()
execution)
real 0m44.964s 0m45.083s 0m45.171s
user 0m0.000s 0m0.000s 0m0.004s
sys 0m1.580s 0m1.612s 0m1.576s
and with reverted c229828ca6bc62d (rcu: Throttle rcu_try_advance_all_cbs()
execution)
real 0m30.816s 0m31.326s 0m31.191s
user 0m0.004s 0m0.004s 0m0.008s
sys 0m1.240s 0m1.248s 0m1.232s
---
Results with 100% load on both CPUs (started script make-load twice)
with c229828ca6bc62d (rcu: Throttle rcu_try_advance_all_cbs() execution)
real 0m20.294s 0m20.538s 0m20.470s
user 0m0.004s 0m0.000s 0m0.000s
sys 0m1.284s 0m1.388s 0m1.612s
and with reverted c229828ca6bc62d (rcu: Throttle rcu_try_advance_all_cbs()
execution)
real 0m20.305s 0m20.056s 0m20.259
user 0m0.000s 0m0.000s 0m0.000s
sys 0m1.416s 0m1.452s 0m1.560s
Attachment:
unixbench
Description: application/gmc-link