Re: [RFC PATCH] alispinlock: acceleration from lock integration on multi-core platform

From: Ling Ma
Date: Mon Apr 04 2016 - 23:44:14 EST

Hi Longman,

> with some modest increase in performance. That can be hard to justify. Maybe
> you should find other use cases that involve less changes, but still have
> noticeable performance improvement. That will make it easier to be accepted.

The attachment covers another use case for the new lock optimization.
It includes two files: main.c (a user-space workload) and
fcntl-lock-opt.patch (a kernel patch against 4.3.0-rc4).
The hardware platform is an Intel E5 2699 V3 with 72 threads (18 cores * 2 sockets * 2 HT).

1. When we run a.out (built from main.c) on the original 4.3.0-rc4 kernel,
the average throughput is 1887592 (98% CPU cost according to perf top -d1).

2. When we run a.out (built from main.c) with fcntl-lock-opt.patch applied,
the average throughput is 5277281 (91% CPU cost according to perf top -d1).

So the new mechanism gives us about a 2.8x (5277281 / 1887592) improvement.

Appreciate your comments.

Thanks
Ling

Attachment: test-lock.tar
Description: Unix tar archive
