Re: Kernel getting hosed?

From: Robert Hancock
Date: Fri Sep 25 2009 - 22:48:03 EST


On 09/25/2009 07:02 PM, Loren Rogers wrote:
Hello,
I am developing a multi-threaded media-based application written for
an iMX27-based processor running kernel 2.6.24. But I'm seeing a
weird "phenomenon" where certain processes/threads are not being
serviced and my clock (according to gettimeofday()) gets set back as
well. Here are some of the symptoms:

1. It's usually the same application threads that keep being
serviced, and the same ones that stop being serviced
2. The problem usually lasts for about 5 and a half minutes and then
appears to correct itself
3. I'll see the CPU load for my application process quickly jump up to
99% right before the phenomenon (according to top)
4. My IP-telnet and serial terminal sessions are both unusable.
5. I have a logging utility with a timestamp feature (gettimeofday())
where, once this problem corrects itself, the clock has been set to
the exact time the problem started (i.e. let's say the problem starts
at 12:00:00, and I'll be logging msgs like 12:01:00, 12:04:22, etc...
Then after the problem "stops" the timestamp on my logger is once
again 12:00:00). And when I do a command "date" the clock will say
12:00:00!
6. I think all of my IP-based network threads are being serviced.
7. A colleague wrote a utility on one of the "alive" threads to start
collecting proc data once we know we are in this state; and he told me
that the proc counters have pretty much halted.


My colleagues and I have been chasing this for three weeks now. I
have no clue on how to determine the culprit(s). At first I thought
it was some bad code in the user-based application, but can someone
tell me with 100% certainty that this is either a user-space problem
or a kernel problem? If it is a kernel problem, how can a user-space
application hose a kernel to this extent?

If anybody can help me with some tool or tools to help diagnose the
cause of the problem or even where to start looking I would REALLY
appreciate it. Thank you

If the system clock is jumping backwards, then unless some process is mucking with the clock, it sounds like there's some kind of kernel timekeeping problem on that platform.
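
One rough way to tell those two cases apart from user space is to log the wall clock and the monotonic clock side by side. A process calling settimeofday() moves only CLOCK_REALTIME, so the offset between the two clocks changes. If CLOCK_MONOTONIC itself stalls or goes backwards, that points at kernel timekeeping instead. The sketch below is only an illustration under that assumption; the helper names (ts_to_ns, wall_minus_mono_ns, wall_clock_stepped_back, mono_went_backwards, sample_clocks) are made up for this example and are not from any existing tool.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds (hypothetical helper). */
static int64_t ts_to_ns(const struct timespec *ts)
{
    /* Sanity check: tv_nsec must be a valid sub-second value. */
    assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Offset between the wall clock and the monotonic clock, in ns. */
static int64_t wall_minus_mono_ns(const struct timespec *wall,
                                  const struct timespec *mono)
{
    return ts_to_ns(wall) - ts_to_ns(mono);
}

/*
 * Returns 1 if the wall-vs-monotonic offset shrank by more than
 * tolerance_ns between two samples, i.e. the wall clock was stepped
 * back relative to the monotonic clock (for example by settimeofday()).
 */
static int wall_clock_stepped_back(int64_t prev_offset_ns,
                                   int64_t cur_offset_ns,
                                   int64_t tolerance_ns)
{
    return (prev_offset_ns - cur_offset_ns) > tolerance_ns;
}

/*
 * Returns 1 if the monotonic clock itself went backwards between two
 * samples. That should never happen and would suggest a kernel
 * timekeeping problem rather than a process setting the time.
 */
static int mono_went_backwards(int64_t prev_mono_ns, int64_t cur_mono_ns)
{
    return cur_mono_ns < prev_mono_ns;
}

/*
 * Sample both clocks. On success returns 0 and fills in the monotonic
 * time and the wall-minus-monotonic offset; returns -1 on failure.
 */
static int sample_clocks(int64_t *mono_ns, int64_t *offset_ns)
{
    struct timespec wall, mono;

    if (clock_gettime(CLOCK_REALTIME, &wall) != 0 ||
        clock_gettime(CLOCK_MONOTONIC, &mono) != 0)
        return -1;

    *mono_ns = ts_to_ns(&mono);
    *offset_ns = wall_minus_mono_ns(&wall, &mono);
    return 0;
}
```

Calling sample_clocks() once a second from one of the threads that stays alive, and logging whenever either check fires, should at least narrow down which clock is misbehaving. On older glibc versions clock_gettime() lives in librt, so the program may need to be linked with -lrt.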