Re: [PATCH] arm64: Increase the max granular size

From: Tirumalesh Chalamarla
Date: Fri Sep 25 2015 - 14:04:07 EST

On 09/25/2015 07:45 AM, Robert Richter wrote:

On 22.09.15 19:29:02, Will Deacon wrote:
On Tue, Sep 22, 2015 at 06:59:48PM +0100, Robert Richter wrote:
From: Tirumalesh Chalamarla <tchalamarla@xxxxxxxxxx>

Increase the standard cacheline size to avoid having locks in the same
cacheline.

Cavium's ThunderX core implements cache lines of 128 bytes. With the
current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
share the same cache line, leading to a performance degradation.
Increasing the size fixes that.

Do you have an example of that happening?

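
The case described in the patch text can be sketched in a few lines of
C11. This is only an illustration, not code from the patch: demo_lock_t,
struct locks_64, struct locks_128 and line_index() are hypothetical
names, and the sketch assumes each structure starts on a 128-byte
boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel lock; only its size matters here. */
typedef struct { uint32_t val; } demo_lock_t;

#define GRANULE_64   64   /* granule with L1_CACHE_SHIFT = 6 */
#define GRANULE_128  128  /* cache line size of the ThunderX core */

/* Each lock is aligned to a 64-byte granule. */
struct locks_64 {
    _Alignas(GRANULE_64) demo_lock_t a;
    _Alignas(GRANULE_64) demo_lock_t b;
};

/* Each lock is aligned to a 128-byte granule. */
struct locks_128 {
    _Alignas(GRANULE_128) demo_lock_t a;
    _Alignas(GRANULE_128) demo_lock_t b;
};

/* Index of the hardware cache line that holds a given byte offset. */
size_t line_index(size_t offset, size_t line_size)
{
    return offset / line_size;
}

/* With 64-byte alignment, b sits 64 bytes after a, which is still
 * inside the same 128-byte hardware line. */
static_assert(offsetof(struct locks_64, b) == GRANULE_64,
              "b follows a by one 64-byte granule");

/* With 128-byte alignment, b starts on the next hardware line. */
static_assert(offsetof(struct locks_128, b) == GRANULE_128,
              "b follows a by one 128-byte granule");
```

With 128-byte hardware lines, line_index() returns 0 for both members of
struct locks_64 and 0 and 1 for the members of struct locks_128, so only
the second layout keeps the two locks on separate lines.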
I did some 'poor man's kernel build all modules benchmarking' and
could not find significant performance improvements so far (second
part with the patch reverted):

build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m10.490s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.747s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.264s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m0.435s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.569s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.274s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m0.507s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m1.551s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.073s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m1.738s

build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m10.644s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 6m59.814s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.315s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 6m59.610s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 6m59.885s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 6m59.281s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.869s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.953s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.787s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.656s

I will check what kind of workloads this patch was written for.
Tirumalesh, any idea?
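
As a rough summary of the timings above, the following C sketch averages
the ten "real" times of each log. It assumes, following the "second part
with the patch reverted" remark, that the first block of results is the
kernel with the patch applied. The names with_patch, reverted, mean()
and mean_diff_seconds() are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define N_RUNS 10

/* "real" times from the logs, converted to seconds (7m10.490s = 430.490). */
const double with_patch[] = {
    430.490, 419.747, 419.264, 420.435, 419.569,
    419.274, 420.507, 421.551, 419.073, 421.738,
};

const double reverted[] = {
    430.644, 419.814, 420.315, 419.610, 419.885,
    419.281, 420.869, 420.953, 420.787, 420.656,
};

static_assert(sizeof(with_patch) / sizeof(with_patch[0]) == N_RUNS,
              "ten runs with the patch");
static_assert(sizeof(reverted) / sizeof(reverted[0]) == N_RUNS,
              "ten runs with the patch reverted");

/* Arithmetic mean of n samples. */
double mean(const double *v, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Mean build time with the patch reverted minus mean with the patch. */
double mean_diff_seconds(void)
{
    return mean(reverted, N_RUNS) - mean(with_patch, N_RUNS);
}
```

The two means come out at roughly 421.2 s and 421.3 s, a difference of
about 0.1 s on a seven minute build, which is smaller than the
run-to-run variation within either log.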

Mainly for workloads where the compiler optimizes based on the cache
line size. Let me write a small benchmark.


Increasing the size has no negative impact on cache invalidation on
systems with a smaller cache line. There is an impact on memory usage,
but that's not too important for arm64 use cases.

Do you have any before/after numbers to show the impact of this change
on other supported SoCs?
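
The memory usage impact mentioned in the patch text can be illustrated
with a small C11 sketch. The 40-byte payload and the names obj_64,
obj_128 and array_bytes() are assumptions chosen for the example; actual
kernel structures will differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 40-byte object, padded to a 64-byte granule. */
struct obj_64 {
    _Alignas(64) char payload[40];
};

/* The same object, padded to a 128-byte granule. */
struct obj_128 {
    _Alignas(128) char payload[40];
};

/* The alignment rounds the size up to a multiple of the granule. */
static_assert(sizeof(struct obj_64) == 64, "one 64-byte granule");
static_assert(sizeof(struct obj_128) == 128, "one 128-byte granule");

/* Bytes needed for an array of n objects of the given element size. */
size_t array_bytes(size_t n, size_t elem_size)
{
    return n * elem_size;
}
```

For an object this small the footprint doubles: array_bytes(1000,
sizeof(struct obj_64)) is 64000 bytes, while the same call with
sizeof(struct obj_128) gives 128000 bytes. Objects that are already
larger than 128 bytes grow by less than one granule.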
