Re: [PATCH] atmel_tc clocksource/clockevent code

From: Remy Bohmer
Date: Wed Mar 05 2008 - 06:17:40 EST


Hello David,

> Could you elaborate on where that 50-100 usec gets spent?

I have attached a screendump from my ETM debugger. It shows the
complete flow of kernel function calls that happens on a timer
interrupt. In this example the complete sequence takes about 154 us.
Note that the ETM is non-intrusive, so the times in this trace are
real and accurate. (You can even see the effect of the CPU caches:
sometimes the same code just runs faster.)
It is based on 2.6.23.12-rt14 with a ticking kernel; I have also
attached the .config of the kernel under trace. The CPU is running at
160/80 MHz.
I have some custom patches to the kernel, but none of them affect this
particular trace.

> Does the same issue happen with the $SUBJECT patch (if you tweak the
> clocksource ratings to use its clockevents on rm9200)?

Not tested yet, but I will generate a trace for it and post it later.

> I'd not think the timer IRQ logic accounts for much of that time,
> and the rest of the instruction cycles would seem to be either
> in genirq or gentime.

There is more to it than just the genirq mechanism. The softirqs are
kicked, the scheduler is triggered, and so on. A whole waterfall of
events happens, just from having a timer interrupt.

> Unless the issue is that accessing that
> particular hardware is slow, with clockdomain crossing etc ...
> or that AT91 has way too many handlers sitting on IRQ-1.
> If it's a gentime or genirq issue, that's somewhat amenable to
> fixing ... and it would also impact the $SUBJECT patch both on
> other AT91 ARM9 cores and on the AVR32 AP7 cores.

> > So, a higher resolution than 30usec is useless on these cores, and
> > setting timers on a frequency that high will just choke the processor
> > in only handling timers...
> Should the min_delta_ns be increased in at91rm9200_time.c then?

Maybe it should be configurable for these kinds of CPUs?
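For illustration, a minimal sketch of what that could look like in
at91rm9200_time.c, roughly along the lines of the existing setup, and
assuming a hypothetical CONFIG_AT91_MIN_TIMER_DELTA_NS Kconfig symbol
(it does not exist today) to raise the floor above the hardware
minimum of two slow-clock ticks:

	/* Sketch only: clamp min_delta_ns to a board-configurable floor.
	 * CONFIG_AT91_MIN_TIMER_DELTA_NS is an assumed Kconfig symbol,
	 * not an existing kernel option.
	 */
	clkevt.mult = div_sc(AT91_SLOW_CLOCK, NSEC_PER_SEC, clkevt.shift);
	clkevt.max_delta_ns = clockevent_delta2ns(AT91_ST_ALMV, &clkevt);
	clkevt.min_delta_ns = clockevent_delta2ns(2, &clkevt) + 1;
#ifdef CONFIG_AT91_MIN_TIMER_DELTA_NS
	if (clkevt.min_delta_ns < CONFIG_AT91_MIN_TIMER_DELTA_NS)
		clkevt.min_delta_ns = CONFIG_AT91_MIN_TIMER_DELTA_NS;
#endif
	clockevents_register_device(&clkevt);

That way min_delta_ns never drops below what the board maintainer
considers a sane load, while faster platforms keep the current
behaviour.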

If a CPU were infinitely fast, one would probably want an HRT
resolution of zero. But no CPU is that fast, so there is always a need
to find a good balance between timer resolution and CPU load.
We probably do not want an application to be able to DoS the system by
starting (periodic) timers so fast that the CPU does nothing but
handle them. To me, it appears that the HRT framework is going down
that route here: if it is technically possible to create a
microsecond-accurate timer framework, we build it, regardless of
whether it fits the rest of the system.
Notice that I also fell into this pitfall while using HRT, and all I
wanted was an application with a 1 ms accurate timer... Other
processes/daemons in the system also use timers, which eventually
resulted in expiry intervals in the sub-millisecond range, and due to
the overhead that brings to the system, the CPU load just goes
sky-high while actually doing nothing really special.
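For reference, the kind of userspace timer I mean is roughly the
sketch below (not the actual application, just an illustration): a
plain 1 ms periodic loop using clock_nanosleep(). A handful of
independent tasks like this, with unaligned phases, already gives the
clockevent device expiry points far closer together than 1 ms.

	#include <time.h>

	/* Sketch of a 1 ms periodic task (illustration only).
	 * Several of these running with unaligned phases make the
	 * clockevent device fire at sub-millisecond intervals.
	 */
	static void periodic_1ms_loop(void)
	{
		struct timespec next;

		clock_gettime(CLOCK_MONOTONIC, &next);
		for (;;) {
			next.tv_nsec += 1000000;	/* + 1 ms */
			if (next.tv_nsec >= 1000000000) {
				next.tv_nsec -= 1000000000;
				next.tv_sec++;
			}
			clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
					&next, NULL);
			/* ... do the 1 ms work ... */
		}
	}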

So, hires timestamps -> really, really welcome.
Hires timers -> there should be a (configurable) minimum resolution
that fits the hardware, so the CPU does not get overloaded.

> Right now, as you probably recall, it's at the lowest value
> needed for correctness: a smidgeon over two ticks (~ 72 nsec).

I remember...

Kind Regards,

Remy

Attachment: trace-timer-interrupt-rm9200.png
Description: PNG image

Attachment: .config
Description: Binary data