Re: [PATCH v2 3/3] mm/selftest: uffd: Explain the write missing fault check

From: Mike Kravetz
Date: Mon Oct 03 2022 - 22:38:01 EST


On 10/03/22 20:37, Peter Xu wrote:
> It's not obvious why we had a write check for each of the missing messages,
> especially when it should be a locking op. Add a rich comment for that,
> and also try to explain its good side and limitations, so that if someone
> hit it again for either a bug or a different glibc impl there'll be some
> clue to start with.

Thanks! It did take a while to understand all this, so the comment is
appropriate.

>
> Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
> ---
> tools/testing/selftests/vm/userfaultfd.c | 22 +++++++++++++++++++++-
> 1 file changed, 21 insertions(+), 1 deletion(-)

Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
--
Mike Kravetz

>
> diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
> index 74babdbc02e5..297f250c1d95 100644
> --- a/tools/testing/selftests/vm/userfaultfd.c
> +++ b/tools/testing/selftests/vm/userfaultfd.c
> @@ -774,7 +774,27 @@ static void uffd_handle_page_fault(struct uffd_msg *msg,
> continue_range(uffd, msg->arg.pagefault.address, page_size);
> stats->minor_faults++;
> } else {
> - /* Missing page faults */
> + /*
> + * Missing page faults.
> + *
> + * Here we force a write check for each of the missing mode
> + * faults. It's guaranteed because the only threads that
> + * will trigger uffd faults are the locking threads, and
> + * their first instruction to touch the missing page will
> + * always be pthread_mutex_lock().
> + *
> + * Note that here we relied on an NPTL glibc impl detail to
> + * always read the lock type at the entry of the lock op
> + * (pthread_mutex_t.__data.__type, offset 0x10) before
> + * doing any locking operations to guarantee that. It's
> + * actually not good to rely on this impl detail because
> + * logically a pthread-compatible lib can implement the
> + * locks without types and we can fail when linking with
> + * them. However since we used to find bugs with this
> + * strict check we still keep it around. Hopefully this
> + * could be a good hint when it fails again. If one day
> + * it'll break on some other impl of glibc we'll revisit.
> + */
> if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE)
> err("unexpected write fault");
>
> --
> 2.37.3
>