This does cause about a 2.7% regression for me, using O_DIRECT on a raw
block device. Looking at a perf diff, here's the top:

    +2.71%  [kernel.vmlinux]  [k] mod_node_page_state
    +2.22%  [kernel.vmlinux]  [k] iov_iter_extract_pages

and these two are gone:

     2.14%  [kernel.vmlinux]  [k] __iov_iter_get_pages_alloc
     1.53%  [kernel.vmlinux]  [k] iov_iter_get_pages

The rest is mostly in the noise, but mod_node_page_state() sticks out
like a sore thumb. It seems to be caused by the node stat accounting
done in gup.c for FOLL_PIN.
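
To illustrate why that accounting would show up on a small-I/O O_DIRECT
workload, here is a rough userspace model. It is not kernel code, and all
function and struct names in it are made up for illustration. It assumes
the accounting in question is the pair of pin counters
(NR_FOLL_PIN_ACQUIRED on pin, NR_FOLL_PIN_RELEASED on unpin), so every
I/O pays for two node stat updates no matter how little data it moves.

```c
#include <assert.h>

/*
 * Hypothetical userspace model of FOLL_PIN accounting.
 * Nothing here is a kernel API; the fields only stand in for the
 * node stat counters assumed to be updated from mm/gup.c.
 */
struct pin_stats {
	long acquired;	/* stands in for NR_FOLL_PIN_ACQUIRED */
	long released;	/* stands in for NR_FOLL_PIN_RELEASED */
	long updates;	/* number of stat updates issued */
};

/* Stand-in for one mod_node_page_state() call. */
static void stat_mod(struct pin_stats *s, long *counter, long delta)
{
	*counter += delta;
	s->updates++;
}

/* Pinning the pages of one I/O: one stat update. */
static void model_pin(struct pin_stats *s, long nr_pages)
{
	stat_mod(s, &s->acquired, nr_pages);
}

/* Unpinning the pages on completion: one more stat update. */
static void model_unpin(struct pin_stats *s, long nr_pages)
{
	stat_mod(s, &s->released, nr_pages);
}

/* Run nr_ios I/Os and return how many stat updates they cost. */
static long model_dio(struct pin_stats *s, long nr_ios, long pages_per_io)
{
	long i;

	for (i = 0; i < nr_ios; i++) {
		model_pin(s, pages_per_io);
		model_unpin(s, pages_per_io);
	}

	/* Every pinned page was unpinned again. */
	assert(s->acquired == s->released);
	return s->updates;
}
```

In this model the number of updates depends only on the I/O count, not
on the I/O size, so single-page I/O at a high rate is the worst case.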
Confirmed: with the node_stat bits in mm/gup.c disabled, performance is
back to the same level as before.
An almost 3% regression is a bit hard to swallow...