On Fri, Oct 28, 2022 at 06:07:34PM -0400, Mathieu Desnoyers wrote:
On 2022-10-27 18:40, Beau Belgrave wrote:
I'm still slightly unsure about using "uint32_t" for enable check, or going
for "unsigned long". The core question there is whether a 32-bit word test
would cause partial register stalls on 64-bit architectures. Going for
unsigned long would require that user events receives information about the
bitness of the word as input from userspace. (bit=63 rather than 31).
Perhaps this is something the user events ABI should accommodate by
reserving more than 5 bits to express the target bit ?
Yeah, I thought about this. From the user side you can actually use any
arbitrary bits you want by passing a 32-bit aligned value. So you could
take the lower or top half of a 64-bit value. The reason I limit to 0-31
bits is to ensure no page straddling occurs. Futex also does this, check
out get_futex_key() in kernel/futex/core.c.
I would recommend trying out uint64 but give the upper half to
user_events ABI and checking if that works for you if you want say 63
bits to user space. You'd tell the ABI bit 31, but in user space it
would be bit 63.
The enable address is disabled while the async fault-in occurs. This
ensures that we don't attempt fault-in more than is necessary. Once the
page has been faulted in, the address write is attempted again. If the
page couldn't fault-in, then we wait until the next time the event is
enabled to prevent any potential infinite loops.
So if the page is removed from the page cache between the point where it is
faulted in and the moment the write is re-attempted, that will not trigger
another attempt at paging in the page, am I correct ?
If we fault and the fixup user fault fails still with retries, then we
give up until the next enablement of that site.
However, if we fault and the fixup user fault with retries works, and
then we fault again trying to write, that will retry.
I would think this is unexpected. I would expect that failing to fault in
the page would stop any further attempts, but simply failing to pin the page
after faulting it in should simply try again.
The issue I repro is mmap() register the enable at that location, then
unmap(). All the faulting errors just tell you -EFAULT in this case,
even though it was maliciously removed. In this case you could get the
kernel to infinite loop trying to page it in.
Thoughts ?
We need to have some sanity toward giving up faulting in data that never
will make it. The code currently assigns that line to if
fixup_user_fault with retries fails, we give up. pin_user_pages_remote
will not stop it from being attempted again.
Thanks,
Mathieu
Thanks,
-Beau