Re: [PATCH] ring-buffer: Fix slowpath of interrupted event

From: Steven Rostedt
Date: Tue Dec 19 2023 - 10:09:50 EST


On Tue, 19 Dec 2023 23:37:10 +0900
Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx> wrote:

> Yeah the above works, but my question is, do we really need this
> really slow path? I mean;
>
> > if (w == write - event length) {
> > /* Nothing interrupted between A and C */
> > /*E*/ write_stamp = ts;
> > delta = ts - after
>
> } else {
> /*
> Something interrupted between A and C, which should record
> a new entry before this reserved entry with newer timestamp.
> we reuse it.
> */
> ts = after = write_stamp;
> delta = 0;
> }
>
> Isn't this enough?

I really like to avoid: delta = 0 when possible. It's basically what I do
when I have no other options. Why?

Because let's just say you are looking at the time of interrupt events. If
you just trace the entry of the interrupt, and that interrupt interrupted
an event being written to, we have this:

Time starts at ts 1000, and we are able to calculate the delta of the
interrupted event. And the trace will have:

1000 - interrupt event
2000 - normal context event
2100 - next normal context event

Where we see the delta between the interrupt event and the normal context
event was 1000. But if we just had it be delta = 0, it would be:

1000 - interrupt event
1000 - normal context event
2100 - next normal context event

It will look like the time between the interrupt event and the normal
context event was instant, and the time between the normal context event
and the next normal context event was 1100 when in reality it was just 100.

The above scenario is rather common. Perhaps it happens 1% of the time. The
case where we currently have delta = 0 only happens when the same event
gets interrupted twice. That is, two separate interrupts came in, one
before it allocated its space on the buffer, and one after it allocated.
That's a much more race occurrence (0.01% possibly, or less). In fact my
traces seldom even show it. Most of the time, even when doing function
tracing, I have a hard time seeing this rare race.

So, if we can have delta=0 only 0.01% or less of the time, I rather do that
then have it be delta=0 1% of the time.

Thanks for the review!

-- Steve