Re: [PATCH 4.4 014/134] perf tools: Make perf_event__synthesize_mmap_events() scale
From: Greg Kroah-Hartman
Date: Fri Apr 06 2018 - 03:03:23 EST
On Thu, Mar 29, 2018 at 05:13:56PM +0100, Ben Hutchings wrote:
> On Mon, 2018-03-19 at 19:04 +0100, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch. If anyone has any objections, please let me know.
> >
> > ------------------
> >
> > From: Stephane Eranian <eranian@xxxxxxxxxx>
> >
> >
> > [ Upstream commit 88b897a30c525c2eee6e7f16e1e8d0f18830845e ]
> >
> > This patch significantly improves the execution time of
> > perf_event__synthesize_mmap_events() when running perf record on systems
> > where processes have lots of threads.
> >
> > It turns out that the kernel's /proc/pid/maps support uses an O(N^2)
> > algorithm to generate each map line in the maps file. If you have 1000
> > threads, then you necessarily have 1000 stacks. For each vma, you need to
> > check if it corresponds to a thread's stack. With a large number of threads,
> > this can take a very long time. I have seen latencies >> 10 minutes.
> >
> > As of today, perf does not use the fact that a mapping is a stack, therefore we
> > can work around the issue by using /proc/pid/tasks/pid/maps. This entry does
> > not try to map a vma to stack and is thus much faster with no loss of
> > functionality.
> >
> > The proc-map-timeout logic is kept in case users still want some upper limit.
> >
> > In V2, we fix the file path from /proc/pid/tasks/pid/maps to actual
> > /proc/pid/task/pid/maps, tasks -> task. Thanks Arnaldo for catching this.
> >
> > Committer note:
> >
> > This problem seems to have been eliminated in the kernel since commit
> > b18cb64ead40 ("fs/proc: Stop trying to report thread stacks").
> [...]
>
> I don't think so. It looks like this was fixed by commit 65376df58217
> ("proc: revert /proc/<pid>/maps [stack:TID] annotation") which we
> already have in 4.4-stable. But older branches (3.16, 3.18, 4.1) don't
> have that and probably should do.

Now added to 3.18.y

> It looks like commit b18cb64ead40 ("fs/proc: Stop trying to report
> thread stacks") is also a candidate for stable.

Now added to 3.18.y and 4.4.y, thanks!

greg k-h
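
For reference, the workaround described in the quoted commit message (reading the per-task maps file instead of the per-process one) can be sketched roughly as below. This is only an illustration in Python, not the perf implementation, which is in C; the helper names maps_path, parse_maps_line and read_maps are made up for this example, and the field layout is the one documented in proc(5).

```python
def maps_path(pid: int, per_task: bool = True) -> str:
    """Return the maps file path for pid (hypothetical helper).

    per_task=True gives /proc/<pid>/task/<pid>/maps, the entry the
    commit message switches to; per_task=False gives /proc/<pid>/maps.
    """
    if per_task:
        return f"/proc/{pid}/task/{pid}/maps"
    return f"/proc/{pid}/maps"


def parse_maps_line(line: str):
    """Split one maps line into (start, end, perms, offset, pathname).

    Assumed layout: address perms offset dev inode [pathname].
    Anonymous mappings have no pathname and yield an empty string.
    """
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    perms = fields[1]
    offset = int(fields[2], 16)
    pathname = fields[5].strip() if len(fields) > 5 else ""
    return start, end, perms, offset, pathname


def read_maps(pid: int, per_task: bool = True):
    """Read and parse every mapping of pid (Linux only)."""
    with open(maps_path(pid, per_task)) as f:
        return [parse_maps_line(l) for l in f if l.strip()]
```

On a Linux system, read_maps(os.getpid()) after importing os would list the mappings of the calling process. The mappings themselves are the same in both files; per the commit message, only the stack annotation work differs.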