Re: [PATCH/RFC 12/16] perf tools: Reduce lock contention when processing events

From: Namhyung Kim
Date: Mon Dec 14 2015 - 21:04:18 EST


Hi Jiri,

On Mon, Dec 14, 2015 at 09:43:04AM +0100, Jiri Olsa wrote:
> On Thu, Dec 10, 2015 at 04:53:31PM +0900, Namhyung Kim wrote:
> > When multi-threading is enabled, machine->threads_lock is contended
> > because all worker threads try to grab the writer lock via
> > machine__findnew_thread(). Usually the thread they're looking for is
> > already in the tree, so they only need the reader lock.
> >
> > Thus try machine__find_thread() first, and then fall back to the
> > 'findnew' API. This reduces contention on the writer lock.
>
> found one other place, but I guess you chose those
> based on profiling the contention?

I only profiled my earlier perf report patchset and found that
machine__findnew_thread() in perf_event__preprocess_sample() was
contended. But perf-report serializes meta events and only processes
sample events in parallel.

As perf-top processes all events in parallel, I thought I should use
the same technique in other places.

>
> parent lookup in machine__process_fork_event

I'll add that too.

Thanks,
Namhyung