RE: [PATCH 1/2] perf mmap: Fix perf backward recording
From: Liang, Kan
Date: Wed Nov 01 2017 - 09:57:57 EST
> On 2017/11/1 20:00, Namhyung Kim wrote:
> > On Wed, Nov 01, 2017 at 06:32:50PM +0800, Wangnan (F) wrote:
> >>
> >> On 2017/11/1 17:49, Namhyung Kim wrote:
> >>> Hi,
> >>>
> >>> On Wed, Nov 01, 2017 at 05:53:26AM +0000, Wang Nan wrote:
> >>>> perf record backward recording doesn't work as we expected: it never
> >>>> overwrite when ring buffer full.
> >>>>
> >>>> Test:
> >>>>
> >>>> (Run a busy printing python task background like this:
> >>>>
> >>>> while True:
> >>>>     print 123
> >>>>
> >>>> send SIGUSR2 to perf to capture snapshot.)
> >> [SNIP]
> >>
> >>>> Signed-off-by: Wang Nan <wangnan0@xxxxxxxxxx>
> >>>> ---
> >>>> tools/perf/util/evlist.c | 8 +++++++-
> >>>> 1 file changed, 7 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> >>>> index c6c891e..4c5daba 100644
> >>>> --- a/tools/perf/util/evlist.c
> >>>> +++ b/tools/perf/util/evlist.c
> >>>> @@ -799,22 +799,28 @@ perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
> >>>> }
> >>>> static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
> >>>> - struct mmap_params *mp, int cpu_idx,
> >>>> + struct mmap_params *_mp, int cpu_idx,
> >>>> int thread, int *_output, int *_output_backward)
> >>>> {
> >>>> struct perf_evsel *evsel;
> >>>> int revent;
> >>>> int evlist_cpu = cpu_map__cpu(evlist->cpus, cpu_idx);
> >>>> + struct mmap_params *mp;
> >>>> evlist__for_each_entry(evlist, evsel) {
> >>>> struct perf_mmap *maps = evlist->mmap;
> >>>> + struct mmap_params rdonly_mp;
> >>>> int *output = _output;
> >>>> int fd;
> >>>> int cpu;
> >>>> + mp = _mp;
> >>>> if (evsel->attr.write_backward) {
> >>>> output = _output_backward;
> >>>> maps = evlist->backward_mmap;
> >>>> + rdonly_mp = *_mp;
> >>>> + rdonly_mp.prot &= ~PROT_WRITE;
> >>>> + mp = &rdonly_mp;
> >>>> if (!maps) {
> >>>> maps = perf_evlist__alloc_mmap(evlist);
> >>>> --
> >>> What about this instead (not tested)?
> >>>
> >>>
> >>> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> >>> index c6c891e154a6..27ebe355e794 100644
> >>> --- a/tools/perf/util/evlist.c
> >>> +++ b/tools/perf/util/evlist.c
> >>> @@ -838,6 +838,11 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
> >>> if (*output == -1) {
> >>> *output = fd;
> >>> + if (evsel->attr.write_backward)
> >>> + mp->prot = PROT_READ;
> >>> + else
> >>> + mp->prot = PROT_READ | PROT_WRITE;
> >>> +
> >> If evlist->overwrite is true, PROT_WRITE should be unset even if
> >> write_backward is not set. If you want to delay the setting of mp->prot,
> >> you need to consider both evlist->overwrite and
> >> evsel->attr.write_backward.
> > I thought evsel->attr.write_backward should be set when
> > evlist->overwrite is set. Do you mean following case?
> >
> > perf record --overwrite -e 'cycles/no-overwrite/'
> >
>
> No. evlist->overwrite is unrelated to '--overwrite'. This is why I
> said the concept of 'overwrite' and 'backward' is ambiguous.
>
Yes, I think we should make it clear.
As we discussed previously, there are four possible combinations
for accessing the ring buffer: 'forward non-overwrite', 'forward overwrite',
'backward non-overwrite' and 'backward overwrite'.
Actually, not all of the combinations are necessary.
- 'forward overwrite' mode brings some problems which were mentioned
in commit ID 9ecda41acb97 ("perf/core: Add ::write_backward attribute
to perf event").
- 'backward non-overwrite' mode is very similar to 'forward non-overwrite'
and has no extra advantage. Keeping only one non-overwrite mode is enough.
So 'forward non-overwrite' and 'backward overwrite' are enough for all perf tools.
Furthermore, 'forward' and 'backward' only indicate the direction of the
ring buffer. They don't affect the result or the performance, so they are
not as important as the concept of overwrite/non-overwrite.
To simplify the concept, only 'non-overwrite' mode and 'overwrite' mode should
be kept. 'non-overwrite' mode implies the forward ring buffer, and 'overwrite'
mode implies the backward ring buffer.
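As a rough sketch of that simplification (not tested against the tree; the
names rb_mode, rb_config and rb_config_for are made up for illustration and
are not perf's actual types), a single mode could decide both the write
direction and the mmap protection, assuming an overwrite buffer is mapped
without PROT_WRITE as discussed above:

```c
#include <stdbool.h>
#include <sys/mman.h>

/* Hypothetical names for illustration only; not perf's real data types. */
enum rb_mode {
	RB_NON_OVERWRITE,	/* forward ring buffer */
	RB_OVERWRITE,		/* backward ring buffer */
};

struct rb_config {
	bool write_backward;	/* value for the event's write_backward attribute */
	int prot;		/* protection flags to pass to mmap() */
};

/* Derive both settings from the single overwrite/non-overwrite choice. */
struct rb_config rb_config_for(enum rb_mode mode)
{
	struct rb_config cfg;

	if (mode == RB_OVERWRITE) {
		/* Overwrite implies backward writing and a read-only mapping. */
		cfg.write_backward = true;
		cfg.prot = PROT_READ;
	} else {
		/* Non-overwrite implies forward writing and a writable mapping. */
		cfg.write_backward = false;
		cfg.prot = PROT_READ | PROT_WRITE;
	}

	return cfg;
}
```

With something like this, callers would only pick 'overwrite' or
'non-overwrite', and the direction would no longer be a separate knob.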
> perf record never sets 'evlist->overwrite'. What '--overwrite' actually
> does is setting write_backward. Some testcases needs overwrite evlist.
>
There are only four test cases which set overwrite: sw-clock, task-exit,
mmap-basic and backward-ring-buffer.
Only backward-ring-buffer is 'backward overwrite'.
The other three are all 'forward overwrite'. We just need to set write_backward
to convert them to 'backward overwrite'.
I think it's not hard to clean up.
Thanks,
Kan