Re: [RFC PATCH 0/0 v3] callchain fixes and cleanups

From: Ingo Molnar
Date: Wed Aug 18 2010 - 12:15:54 EST



* Will Deacon <will.deacon@xxxxxxx> wrote:

> On Wed, 2010-08-18 at 04:55 +0100, Frederic Weisbecker wrote:
> > On Tue, Aug 17, 2010 at 11:32:39AM +0100, Will Deacon wrote:
> > > I've tested this on an ARM Cortex-A9 board and it all seems fine [plus
> > > the code is a lot cleaner!].
> > >
> > > Tested-by: Will Deacon <will.deacon@xxxxxxx>
> >
> > Thanks a lot!
> >
> > BTW, out of curiosity, do you have NMIs on ARM and do the hardware events
> > make use of them? Or may be you use FIQ to simulate NMIs?
> >
>
> We don't have NMIs on ARM [so obviously we can't use them!] but you're right
> to point out the FIQ. I've actually been thinking about this during the past
> week, but there are the following problems:
>
> (1) The FIQ isn't always wired up in the hardware, so you can't
> assume that it is available.

We don't always have NMIs on x86 either - we fall back to hrtimers in that
case.

> (2) The FIQ can only have a single handler at a given time. This
> is because it is a separate exception mode, with its own banked
> registers. Consequently, we might not be able to use it if it's
> being used for something else.

Technically the NMI is only a single exception source on x86 as well. We
multiplex from there - if there are multiple users we call them using a
notifier chain.
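
To illustrate the multiplexing idea, here is a minimal user-space sketch of a
priority-ordered notifier chain behind a single entry point. It is a model
only, not the kernel's notifier API: chain_register(), chain_call(), the
CHAIN_* return codes and both example handlers are hypothetical names made up
for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of a notifier chain; not the kernel API. */
enum { CHAIN_DONE = 0, CHAIN_STOP = 1 };
enum nmi_source { SRC_PMU, SRC_WATCHDOG };

struct chain_user {
	int (*handler)(void *data);	/* returns CHAIN_STOP to claim the event */
	int priority;			/* higher priority runs first */
	struct chain_user *next;
};

static struct chain_user *chain_head;

/* Insert a user into the chain, keeping descending priority order. */
static void chain_register(struct chain_user *u)
{
	struct chain_user **p = &chain_head;

	assert(u != NULL && u->handler != NULL);
	while (*p && (*p)->priority >= u->priority)
		p = &(*p)->next;
	u->next = *p;
	*p = u;
}

/*
 * The single exception entry point: walk the chain until one handler
 * claims the event. Returns the number of handlers that were called.
 */
static int chain_call(void *data)
{
	struct chain_user *u;
	int called = 0;

	for (u = chain_head; u; u = u->next) {
		called++;
		if (u->handler(data) == CHAIN_STOP)
			break;
	}
	return called;
}

/* Example user: claims only events that came from the PMU. */
static int pmu_handler(void *data)
{
	return *(enum nmi_source *)data == SRC_PMU ? CHAIN_STOP : CHAIN_DONE;
}

/* Example user: last in line, claims whatever is left over. */
static int watchdog_handler(void *data)
{
	(void)data;
	return CHAIN_STOP;
}
```

With the PMU user registered at a higher priority than the watchdog user, a
PMU event is claimed after one call, while any other event falls through to
the watchdog handler. A FIQ with a single handler slot could in principle be
shared the same way.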

> (3) The Trustzone security extensions may reserve the FIQ for secure
> use only or make it available only via the secure monitor [which
> will increase latency].

As long as it can still be detected during PMU init and set up safely, it
should be OK.

> Of course, the advantage is that we could then use sample-based profiling
> techniques in sections of code where the interrupts are disabled.

Once you've tried NMI profiling you won't be going back - the difference is
night and day ;-)

Here's the profile of a scheduling-intensive workload using a timer-based
fallback path:

# Events: 586 cycles
#
# Overhead       Command      Shared Object  Symbol
# ........  ............  .................  ...........................
#
    21.33%  pipe-test-1m  [kernel.kallsyms]  [k] finish_task_switch
    14.33%  pipe-test-1m  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     4.61%  pipe-test-1m  [kernel.kallsyms]  [k] avc_has_perm_noaudit
     3.75%  pipe-test-1m  [kernel.kallsyms]  [k] pipe_read
     3.58%  pipe-test-1m  libc-2.12.so       [.] __write_nocancel
     3.58%  pipe-test-1m  [kernel.kallsyms]  [k] copy_user_generic_string
     3.41%  pipe-test-1m  libc-2.12.so       [.] __GI___libc_read
     3.07%  pipe-test-1m  [kernel.kallsyms]  [k] system_call_after_swapgs
     3.07%  pipe-test-1m  [kernel.kallsyms]  [k] pipe_write
     3.07%  pipe-test-1m  [kernel.kallsyms]  [k] file_has_perm
     2.22%  pipe-test-1m  pipe-test-1m       [.] main
     2.22%  pipe-test-1m  [kernel.kallsyms]  [k] selinux_file_permission
     1.88%  pipe-test-1m  [kernel.kallsyms]  [k] rw_verify_area
     1.88%  pipe-test-1m  [kernel.kallsyms]  [k] fsnotify

And here is the same workload profiled using NMIs:


# Events: 23K cycles
#
# Overhead       Command      Shared Object  Symbol
# ........  ............  .................  ...................................
#
     7.14%  pipe-test-1m  [kernel.kallsyms]  [k] __default_send_IPI_dest_field
     4.34%  pipe-test-1m  [kernel.kallsyms]  [k] schedule
     4.27%  pipe-test-1m  [kernel.kallsyms]  [k] __switch_to
     3.88%  pipe-test-1m  [kernel.kallsyms]  [k] pipe_read
     3.57%  pipe-test-1m  [kernel.kallsyms]  [k] switch_mm
     3.45%  pipe-test-1m  [kernel.kallsyms]  [k] file_has_perm
     3.37%  pipe-test-1m  [kernel.kallsyms]  [k] copy_user_generic_string
     3.37%  pipe-test-1m  pipe-test-1m       [.] main
     3.20%  pipe-test-1m  [kernel.kallsyms]  [k] avc_has_perm_noaudit
     2.62%  pipe-test-1m  libc-2.12.so       [.] __GI___libc_read
     2.09%  pipe-test-1m  [kernel.kallsyms]  [k] fsnotify
     1.96%  pipe-test-1m  [kernel.kallsyms]  [k] system_call
     1.94%  pipe-test-1m  [kernel.kallsyms]  [k] pipe_write
     1.90%  pipe-test-1m  libc-2.12.so       [.] __write_nocancel
     1.88%  pipe-test-1m  [kernel.kallsyms]  [k] mutex_lock
     1.88%  pipe-test-1m  [kernel.kallsyms]  [k] selinux_file_permission
     1.78%  pipe-test-1m  [kernel.kallsyms]  [k] mutex_unlock
     1.66%  pipe-test-1m  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
     1.50%  pipe-test-1m  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     1.23%  pipe-test-1m  [kernel.kallsyms]  [k] vfs_read
     1.19%  pipe-test-1m  [kernel.kallsyms]  [k] do_sync_read
     1.18%  pipe-test-1m  [kernel.kallsyms]  [k] update_curr

The NMI output is an order of magnitude richer in information.

> The only way I can think of adding this is as a Kconfig option, which, when
> selected, tries to use the FIQ and then falls back to normal IRQs if it
> fails.

Dynamic detection and a fallback path should be perfectly OK. Kconfig options
have the disadvantage of doubling the test space and halving the tester base
(or worse).
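
As a rough sketch of what runtime detection with a fallback could look like,
here is a small user-space model. Everything in it is an assumption made for
illustration: struct hw_caps, fiq_usable() and pick_sample_source() are
hypothetical names, and the three flags simply mirror the three FIQ concerns
raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what PMU init discovered; not a kernel struct. */
struct hw_caps {
	bool fiq_wired;		/* (1) the FIQ line is connected in hardware */
	bool fiq_claimed;	/* (2) another handler already owns the FIQ */
	bool fiq_secure_only;	/* (3) TrustZone reserves the FIQ for secure use */
};

enum sample_source { SAMPLE_FIQ, SAMPLE_IRQ };

/* The FIQ is usable only if all three concerns are ruled out. */
static bool fiq_usable(const struct hw_caps *caps)
{
	assert(caps != NULL);
	return caps->fiq_wired && !caps->fiq_claimed && !caps->fiq_secure_only;
}

/* Decide at init time: prefer the FIQ, otherwise fall back to normal IRQs. */
static enum sample_source pick_sample_source(const struct hw_caps *caps)
{
	if (fiq_usable(caps))
		return SAMPLE_FIQ;
	return SAMPLE_IRQ;	/* the fallback is always available */
}
```

The same binary then runs everywhere and picks the best available source at
PMU init, so every tester exercises the same code rather than one of two
build configurations.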

Thanks,

Ingo