Re: [profile] amortize atomic hit count increments

From: Andrea Arcangeli
Date: Tue Sep 14 2004 - 11:37:36 EST


On Tue, Sep 14, 2004 at 08:51:03AM -0700, William Lee Irwin III wrote:
> On Mon, Sep 13, 2004 at 09:47:48PM -0700, William Lee Irwin III wrote:
> >> timer interrupt, usually at boot. The following patch attempts to
> >> amortize the atomic operations done on the profile buffer to address
> >> this stability concern. This patch has nothing to do with performance;
>
> On Tue, Sep 14, 2004 at 01:34:19PM +0200, Andrea Arcangeli wrote:
> > isn't it *much* simpler and much more efficient to just have a per-cpu
> > idle function? I seriously doubt you'll get simultaneous collisions on
> > anything but the 'halt' instruction in the idle function.
>
> Sampling the profile buffer at regular intervals shows far less than
> 256 distinct functions hit in 1s intervals even with all cpus busy. As
> for whether that would be sufficient, that will have to be answered by
> those who reported the bug. I suppose to test whether things besides
> idling do cause this problem, one would boot with a restricted
> prof_cpu_mask, load all cpus on the machine, set the prof_cpu_mask to
> unrestricted, and see if it livelocks before the load terminates.

It probably worth to measure it. The real bottleneck happens when all
cpus tries to get an exclusive lock on the same cacheline at the *same*
time. 1 second is a pretty long time, if there's no contention of the
cacheline, things are normally ok.

this is basically the same issue we had with RCU since all timers fired
at the same wall clock time, and all of them tried to change bits in the
same cacheline at the same time, that is a workload that collapse a
512-way machine ;). The profile timer is no different.

Simply removing the idle time accounting would fix it, however this
cripple down functionality a little bit, but it'll be a very good way to
test if my theory is correct, or if you truly need some per-cpu logic in
the profiler.

You could also fake it, have a per-cpu counter only for the current->pid
case, and then once somebody reads /proc/profile, you flush the total
per-cpu count to the counter in the buffer that corresponds to the EIP
of the idle func.

Before dedicidng I'd suggest to have a look and see how the below patch
compares to your approch in performance terms.

--- sles/arch/ia64/kernel/time.c.~1~ 2004-08-25 02:47:33.000000000 +0200
+++ sles/arch/ia64/kernel/time.c 2004-09-14 18:01:39.792182008 +0200
@@ -206,6 +206,9 @@ ia64_do_profile (struct pt_regs * regs)
if (!prof_buffer)
return;

+ if (!current->pid)
+ return;
+
ip = instruction_pointer(regs);
/* Conserve space in histogram by encoding slot bits in address
* bits 2 and 3 rather than bits 0 and 1.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/