Re: [PATCH] fs/address_space: move i_mmap_rwsem to mitigate a false sharing with i_mmap.

From: Matthew Wilcox
Date: Mon Feb 05 2024 - 18:08:31 EST


On Mon, Feb 05, 2024 at 02:22:29PM +0800, JonasZhou wrote:
> When running UnixBench/execl, each execl process repeatedly performs
> i_mmap_lock_write -> vma_interval_tree_remove/insert ->
> i_mmap_unlock_write. As indicated below, when i_mmap and i_mmap_rwsem
> are in the same CACHE Line, there will be more HITM.

(I wasn't familiar with the term HITM. For anyone else who's
unfamiliar, this appears to mean a hit in another core's cache, which
holds the cacheline in the Modified state.)

> Func0: i_mmap_lock_write
> Func1: vma_interval_tree_remove/insert
> Func2: i_mmap_unlock_write
> In the same CACHE Line
> Process A | Process B | Process C | Process D | CACHE Line state
> ----------+-----------+-----------+-----------+-----------------
> Func0     |           |           |           | I->M
>           | Func0     |           |           | HITM M->S
> Func1     |           |           |           | may change to M
>           |           | Func0     |           | HITM M->S
> Func2     |           |           |           | S->M
>           |           |           | Func0     | HITM M->S
>
> In different CACHE Lines
> Process A | Process B | Process C | Process D | CACHE Line state
> ----------+-----------+-----------+-----------+-----------------
> Func0     |           |           |           | I->M
>           | Func0     |           |           | HITM M->S
> Func1     |           |           |           |
>           |           | Func0     |           | S->S
> Func2     |           |           |           | S->M
>           |           |           | Func0     | HITM M->S
>
> The same issue will occur in Unixbench/shell because the shell
> launches a lot of shell commands, loads executable files and dynamic
> libraries into memory, execute, and exit.

OK, I see.

> Yes, his commit has been merged into the Linux kernel, but there
> is an issue. After moving i_mmap_rwsem below flags, there is a
> 32-byte gap between i_mmap_rwsem and i_mmap. However, the struct
> address_space is aligned to sizeof(long), which is 8 on the x86-64
> architecture. As a result, i_mmap_rwsem and i_mmap may be placed on
> the same CACHE Line, causing a false sharing problem. This issue has
> been observed using the perf c2c tool.

Got it. OK, let's put this patch in. It's clearly a stopgap measure.
I'll reply to Dave's email with a longer-term solution.
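
For anyone who wants to check the alignment argument quoted above, here is
a rough userspace sketch, not kernel code. It assumes 64-byte cache lines
and reads the "32-byte gap" as the distance between the starting addresses
of the two fields; both are assumptions, and the helper names
same_cache_line() and sharing_bases() are made up for illustration. It only
looks at the first byte of each field, so it ignores fields that straddle a
line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. */
#define CACHE_LINE 64u

/* True if byte addresses a and b fall in the same cache line. */
static bool same_cache_line(uintptr_t a, uintptr_t b)
{
	return a / CACHE_LINE == b / CACHE_LINE;
}

/*
 * Hypothetical helper: the first field may start at any multiple of
 * `align` within a cache line, and the second field starts `gap` bytes
 * later.  Return how many of those starting positions leave the two
 * starting addresses on the same line.
 */
static unsigned int sharing_bases(unsigned int align, unsigned int gap)
{
	unsigned int base, n = 0;

	assert(align != 0 && CACHE_LINE % align == 0);
	for (base = 0; base < CACHE_LINE; base += align)
		if (same_cache_line(base, (uintptr_t)base + gap))
			n++;
	return n;
}
```

Under these assumptions, sharing_bases(8, 32) counts the 8-byte-aligned
starting positions at which two fields 32 bytes apart still share a line,
while a 64-byte distance never does. That is why a 32-byte gap on its own
cannot rule out false sharing when the struct is only aligned to
sizeof(long).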