[PATCH v2 0/3] lib/list_batch: A simple list insertion/deletion batching facility

From: Waiman Long
Date: Fri Jan 29 2016 - 14:31:13 EST


v1->v2:
- Address feedbacks from PeterZ and Andi Kleen.

This patchset introduces a simple list insertion/deletion batching
facility to batch multiple list insertion and deletion operations
into a single one under one lock/unlock critical section.

Patch 1 introduces this new facility.

Patch 2 enables it for the x86 architecture.

Patch 3 makes the insertion and deletion of the VFS superblock's
inode list to use the new list batching functions.

On x86-64, the generated code (gcc v4.8.5) of the inode_sb_list_add()
function without the patch was:

0x0000000000001030 <+0>: callq 0x1035 <inode_sb_list_add+5>
0x0000000000001035 <+5>: push %rbp
0x0000000000001036 <+6>: mov %rsp,%rbp
0x0000000000001039 <+9>: push %rbx
0x000000000000103a <+10>: mov 0x28(%rdi),%rax
0x000000000000103e <+14>: mov %rdi,%rbx
0x0000000000001041 <+17>: lea 0x600(%rax),%rdi
0x0000000000001048 <+24>: callq 0x104d <inode_sb_list_add+29>
0x000000000000104d <+29>: mov 0x28(%rbx),%rsi
0x0000000000001051 <+33>: lea 0x120(%rbx),%rdi
0x0000000000001058 <+40>: mov 0x608(%rsi),%rdx
0x000000000000105f <+47>: add $0x608,%rsi
0x0000000000001066 <+54>: callq 0x106b <inode_sb_list_add+59>
0x000000000000106b <+59>: mov 0x28(%rbx),%rdi
0x000000000000106f <+63>: add $0x600,%rdi
0x0000000000001076 <+70>: callq *0x0
0x000000000000107d <+77>: pop %rbx
0x000000000000107e <+78>: pop %rbp
0x000000000000107f <+79>: retq

With the patch, the function became:

0x0000000000001030 <+0>: callq 0x1035 <inode_sb_list_add+5>
0x0000000000001035 <+5>: push %rbp
0x0000000000001036 <+6>: mov %rsp,%rbp
0x0000000000001039 <+9>: push %r13
0x000000000000103b <+11>: lea 0x120(%rdi),%r13
0x0000000000001042 <+18>: push %r12
0x0000000000001044 <+20>: push %rbx
0x0000000000001045 <+21>: mov 0x28(%rdi),%r12
0x0000000000001049 <+25>: lea 0x600(%r12),%rbx
0x0000000000001051 <+33>: mov %rbx,%rdi
0x0000000000001054 <+36>: callq 0x1059 <inode_sb_list_add+41>
0x0000000000001059 <+41>: test %eax,%eax
0x000000000000105b <+43>: je 0x1081 <inode_sb_list_add+81>
0x000000000000105d <+45>: mov 0x618(%r12),%rsi
0x0000000000001065 <+53>: mov %r13,%rdi
0x0000000000001068 <+56>: mov (%rsi),%rdx
0x000000000000106b <+59>: callq 0x1070 <inode_sb_list_add+64>
0x0000000000001070 <+64>: mov %rbx,%rdi
0x0000000000001073 <+67>: callq *0x0
0x000000000000107a <+74>: pop %rbx
0x000000000000107b <+75>: pop %r12
0x000000000000107d <+77>: pop %r13
0x000000000000107f <+79>: pop %rbp
0x0000000000001080 <+80>: retq
0x0000000000001081 <+81>: lea 0x618(%r12),%rdx
0x0000000000001089 <+89>: mov %r13,%rcx
0x000000000000108c <+92>: mov %rbx,%rsi
0x000000000000108f <+95>: xor %edi,%edi
0x0000000000001091 <+97>: callq 0x1096 <inode_sb_list_add+102>
0x0000000000001096 <+102>: jmp 0x107a <inode_sb_list_add+74>

For the fastpath, the only overheads are the testing of the return value
of the spin_trylock function and the push/pop of 2 more registers. Other
than that, they look very similar.

Waiman Long (3):
lib/list_batch: A simple list insertion/deletion batching facility
lib/list_batch, x86: Enable list insertion/deletion batching for x86
vfs: Enable list batching for the superblock's inode list

arch/x86/Kconfig | 1 +
fs/inode.c | 13 ++---
fs/super.c | 1 +
include/linux/fs.h | 2 +
include/linux/list_batch.h | 133 ++++++++++++++++++++++++++++++++++++++++++++
lib/Kconfig | 7 ++
lib/Makefile | 1 +
lib/list_batch.c | 125 +++++++++++++++++++++++++++++++++++++++++
8 files changed, 275 insertions(+), 8 deletions(-)
create mode 100644 include/linux/list_batch.h
create mode 100644 lib/list_batch.c