Re: [patch] Real-Time Preemption, -VP-2.6.9-rc4-mm1-U0

From: Robert Wisniewski
Date: Fri Oct 15 2004 - 09:56:24 EST


Ingo Molnar writes:
>
> * Robert Wisniewski <bob@xxxxxxxxxxxxxx> wrote:
>
> > > > cmpxchg (basically: try reserve; if fail retry; else write), with
> > > > per-cpu buffers.
> > >
> > > this still does not solve all problems related to irq entries: if the
> > > IRQ interrups the tracing code after a 'successful reserve' but before
> > > the 'else write' point, and the trace is printed/saved from an
> > > interrupt, then there will be an incomplete entry in the trace.
> >
> > That is incorrect. The system behavior needed to generate an
> > incomplete entry is far more complicated and unlikely than what you
> > describe.
>
> ah, but i'm talking about actual first-hand experience, not supposition.
> It happens quite easily with latency traces (which are saved/printed
> from IRQ entries) and it can be very annoying to analyze. My first
> tracers tried to do things without the IRQ flag, so i've seen both
> methods.

This means that other code you've written has this happen, it doesn't mean
the fundamental model is broken. Also, if what you claim is true and there
really is this contention, then it both means that 1) there are many many
other higher priority tasks in the system than the one you are trying to
trace, and 2) it's questionable whether you want to use locks.

> and lets not forget this other issue:
>
> > > also, there is the problem of timestamp atomicity: if an IRQ interrupts
> > > the tracing code and the trace timestamp is taken in the 'else' branch
> > > then a time-reversal situation can occur: the entry will have a
> > > timestamp _larger_ than the IRQ trace-entries. With cli/sti all tracing
> > > entries occur atomically: either fully or not at all.
>
>
> > > > >Also, cli/sti makes it obviously SMP-safe and is pretty
> ^^^^^^^^^^
> > > > >cheap on all x86 CPUs. (Also, i didnt want to use preempt_disable/enable
> ^^^^^^^^^^^^^^^^^^^^^^
> > > > >because the tracer interacts with that code quite heavily.)
> > > >
> > > > No preempt_disable/enable found in the lockless logging in relayfs.
> > >
> > > it would have to do that on PREEMPT_REALTIME. The irq flag solves both
> > > the races, the predictability problem and the preemption problem nicely.
> > >
> > > Ingo
> >
> > If you do not care about performance then you're probably right, this
> > is fine. If you are concerned about the time it takes to go through
> > the sequence of code, then probably not.
>
> see the portion i highlighted above. CPU makers are busy making cli/sti
> as fast as possible. To make sure i tested it on a typical x86 box:
>
> # ./cli-latency
> CLI+STI latency: 8 cycles
>
> Since the trace entry can be filled in a constant amount of time there's
> no reason not to make use of that extra silicon that makes fast cli/sti
> possible! How many trace entries can you generate per second via
> relayfs, on a typical PC? Have you ever measured this?
>
> in fact on a modern CPU cli/sti is very likely faster than a cmpxchgl
> for the following reason: the cmpxchgl generates a read dependency on
> the cacheline which must be fetched in. A single cachemiss can cost
> _alot_ in comparison, 200 cycles easily. While in the cli/sti case we
> stream out to a new cacheline in a linear fashion which is nicely
> optimized by write-allocate cache policies in modern CPUs. No
> cachemisses on the trace buffer! Just simple streaming out of data.
>
> i challenge you to change your code to use cli/sti and compare it with
> the cmpxchgl variant doing some heavy tracing on a modern PC. Please
> just _do_ it and come back with numbers. I have done my own
> measurements.
>
> Ingo

This is an interesting analysis, do you have a paper you can point to, or
can you post the numbers from, and what the setup of the experiment was,
that you ran. Sounds interesting.

Robert Wisniewski
The K42 MP OS Project
Advanced Operating Systems
Scalable Parallel Systems
IBM T.J. Watson Research Center
914-945-3181
http://www.research.ibm.com/K42/
bob@xxxxxxxxxxxxxx
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/