Okay, I was about to ask: doesn't calling get_user() for all data
read page faults increase the cost of a hot code path in general, in
exchange for some potential savings in a very specific use case? Not
sure that is worth the trade-off.
The instruction is cache-hot, since it had to be present in the CPU
cache for the fault to occur. So the overhead is minimal.
But couldn't a pagefault_disable()/pagefault_enable() window prevent
concurrent page faults for the current process, thus degrading its
performance?