Re: [PATCHES/RFC] Re: A concern about overflow ring buffer mode

From: Liang, Kan
Date: Mon Oct 29 2018 - 11:11:37 EST




On 10/29/2018 10:35 AM, Arnaldo Carvalho de Melo wrote:
> Em Mon, Oct 29, 2018 at 10:33:06AM -0400, Liang, Kan escreveu:
>> On 10/29/2018 9:03 AM, Arnaldo Carvalho de Melo wrote:
>>> Em Fri, Oct 26, 2018 at 04:11:51PM -0400, Liang, Kan escreveu:
>>>> On 10/26/2018 3:24 PM, Arnaldo Carvalho de Melo wrote:
>>>>> Em Fri, Oct 26, 2018 at 03:16:29PM -0400, Liang, Kan escreveu:
>>>>>> The switch to overwrite mode was mainly for performance reasons. The
>>>>>> impact was very small when I did my test, but now the effect is easily
>>>>>> noticeable in other tests. Yes, I agree. We may change it back to
>>>>>> non-overwrite mode until the issue is addressed.
>>>
>>> So, I have these two patches in my perf/core branch, with Fixes tags
>>> that will make them get to the stable kernels, ok?
>>
>> I just realized that the problem on KNL will be back if we switch back
>> to non-overwrite mode. The problem is that users have to wait tens of
>> minutes to see perf top results on the screen on KNL. Before that,
>> there is nothing but a black screen.
>> Sorry I didn't notice this last Friday. I thought the ui_warning in
>> perf_top__mmap_read() would give the user a hint, so the user could
>> switch to overwrite mode manually.
>> But unfortunately the ui_warning doesn't help, because it is called
>> after perf_top__mmap_read(), whose processing time could be tens of
>> minutes.
>
> So we need a way to notice that we're in a machine like that and warn
> the user before the wait takes place, ideas on how to do that?

The processing time of each individual perf_top__mmap_read_idx() call should not be that long, so we can check the elapsed time after each perf_top__mmap_read_idx() instead of after the whole loop, and turn the ui_warning into a one-time warning. The patch below does that (not tested yet).


diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index d21d875..5e532e0 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -877,31 +877,40 @@ static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
perf_mmap__read_done(md);
}

+static bool check_processing_time = true;
+
static void perf_top__mmap_read(struct perf_top *top)
{
bool overwrite = top->record_opts.overwrite;
struct perf_evlist *evlist = top->evlist;
- unsigned long long start, end;
+ unsigned long long start, end, tolerance;
int i;

- start = rdclock();
if (overwrite)
perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_DATA_PENDING);

- for (i = 0; i < top->evlist->nr_mmaps; i++)
+ tolerance = (unsigned long long)top->delay_secs * NSEC_PER_SEC / top->evlist->nr_mmaps;
+ start = rdclock();
+ for (i = 0; i < top->evlist->nr_mmaps; i++) {
perf_top__mmap_read_idx(top, i);
+ if (check_processing_time) {
+ end = rdclock();
+
+ if ((end - start) > tolerance) {
+ ui__warning("Too slow to read ring buffer.\n"
+ "Please try increasing the period (-c) or\n"
+ "decreasing the freq (-F) or\n"
+ "limiting the number of CPUs (-C)\n");
+ check_processing_time = false;
+ }
+ start = end;
+ }
+ }

if (overwrite) {
perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_EMPTY);
perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_RUNNING);
}
- end = rdclock();
-
- if ((end - start) > (unsigned long long)top->delay_secs * NSEC_PER_SEC)
- ui__warning("Too slow to read ring buffer.\n"
- "Please try increasing the period (-c) or\n"
- "decreasing the freq (-F) or\n"
- "limiting the number of CPUs (-C)\n");
}

/*