[PATCH] mm: remove an avoidable load of page refcount in page_ref_add_unless
From: Mateusz Guzik
Date: Sat Dec 07 2024 - 03:29:52 EST
Explicitly pre-checking the count adds nothing, as atomic_add_unless
starts by doing the same thing. In other words, no functional changes.
Disassembly of stock filemap_get_read_batch, from perf top while running
readseek2_processes -t 24:
0.04 │ cb: mov 0x34(%rbx),%eax # first load
73.11 │ test %eax,%eax
│ ↓ je 1bd
0.09 │ mov 0x34(%rbx),%eax # second load
1.01 │ d9: test %eax,%eax
│ ↓ je 1bd
0.06 │ lea 0x1(%rax),%edx
0.00 │ lea 0x34(%rbx),%r14
0.00 │ lock cmpxchg %edx,0x34(%rbx)
14.06 │ ↑ jne d9
Signed-off-by: Mateusz Guzik <mjguzik@xxxxxxxxx>
---
I did not bother benchmarking; I don't think anything warrants it for
this one. fwiw it is plausibly worth a few % in a microbenchmark at
higher core counts.
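For reference, a rough userspace model of the load/test/cmpxchg loop seen in
the disassembly above. This is a sketch using C11 atomics, not the kernel
code: model_add_unless is a made-up name, and the memory ordering is only an
approximation. It starts with a plain load and compares it against u before
any cmpxchg is attempted, which is the same check the removed
page_ref_count() comparison performed.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of an add-unless loop.
 * Adds 'a' to '*v' unless '*v' equals 'u'; returns true if the add happened.
 */
static bool model_add_unless(atomic_int *v, int a, int u)
{
	/* Initial load: this is what a caller-side pre-check would duplicate. */
	int c = atomic_load_explicit(v, memory_order_relaxed);

	for (;;) {
		int next;

		if (c == u)
			return false;	/* forbidden value seen, nothing is written */

		next = c + a;
		/* On failure 'c' is refreshed with the current value and we retry. */
		if (atomic_compare_exchange_weak(v, &c, next))
			return true;
	}
}
```

With u == 0, a counter that already reads 0 is rejected by the loop itself
without any store, so a separate check in the caller only adds a load.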
include/linux/page_ref.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
index 8c236c651d1d..fa203894876f 100644
--- a/include/linux/page_ref.h
+++ b/include/linux/page_ref.h
@@ -234,7 +234,7 @@ static inline bool page_ref_add_unless(struct page *page, int nr, int u)
rcu_read_lock();
/* avoid writing to the vmemmap area being remapped */
- if (!page_is_fake_head(page) && page_ref_count(page) != u)
+ if (!page_is_fake_head(page))
ret = atomic_add_unless(&page->_refcount, nr, u);
rcu_read_unlock();
--
2.43.0