Re: [PATCH] trace-cmd: use nonblocking reads for streaming

From: Josef Bacik
Date: Wed Feb 24 2016 - 13:27:34 EST


On 02/23/2016 06:17 PM, Steven Rostedt wrote:
On Thu, 17 Dec 2015 12:01:52 -0500
Josef Bacik <jbacik@xxxxxx> wrote:

I noticed while using the streaming infrastructure in trace-cmd that I was
seemingly missing events. Using other tracing methods I got these events and
record->missed_events was never being set. This is because the streaming
infrastructure uses blocking reads on the per-CPU trace pipes, which means
we'll wait for an entire page's worth of data to be ready before passing it along
to the recorder. This makes it impossible to do long-term tracing that requires
coupling two different events that could occur on different CPUs, and I imagine
it has been what is screwing up my trace-cmd profile runs on our giant 40-CPU
boxes. Fix trace-cmd instead to use a nonblocking read with select to wait for
data on the pipe so we don't burn CPU unnecessarily. With this patch I'm no
longer seeing missed events in my app. Thanks,
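
For readers following the thread, the approach described above (a nonblocking
read, with select used to wait for data so no CPU is burned spinning) can be
sketched roughly as follows. This is not the code from the patch: the helper
names set_nonblocking() and read_available() are hypothetical, and the file
descriptor is assumed to be whatever the caller opened (for example a per-CPU
trace pipe, or an ordinary pipe when experimenting).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put fd into nonblocking mode. Returns 0 on success. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: sleep in select() for up to timeout_ms until fd is
 * readable, then read whatever is there, even if it is less than len bytes.
 * Returns the number of bytes read, 0 on timeout or EOF, -1 on error.
 */
static ssize_t read_available(int fd, void *buf, size_t len, int timeout_ms)
{
	struct timeval tv;
	fd_set rfds;
	ssize_t n;
	int ret;

	assert(buf != NULL && len > 0);

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;

	/* Sleep until data shows up or the timeout expires. */
	ret = select(fd + 1, &rfds, NULL, NULL, &tv);
	if (ret <= 0)
		return ret;

	n = read(fd, buf, len);
	/* Nonblocking fd: the data may be gone again by the time we read. */
	if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
		return 0;
	return n;
}
```

The point of the sketch is only that the reader gets control back with a
partial buffer instead of sleeping until a full page exists; when select
actually reports a given trace file as readable is up to the kernel.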

I just want to make sure I understand what is happening here.

This wasn't trace-cmd's default code, right? This was your own app. And
I'm guessing you were matching events perhaps. That is, after seeing
some event, you looked for the other event. But if that event happened
on a CPU that isn't very active, it would wait forever, as the read
was waiting for a full page?

Or is there something else?

I don't have a problem with the patch. I just want to understand the
issue.

Yup, I had an app that was watching block request issue and completion events, and occasionally a completion event would happen on some mostly idle CPU, so I wouldn't get the completion event until several hours later (the app runs all the time), when we finally had a full page to read from that CPU's buffer. Thanks,

Josef