On 22.03.24 22:18, David Hildenbrand wrote:
On 22.03.24 22:13, Miklos Szeredi wrote:
On Fri, 22 Mar 2024 at 22:08, David Hildenbrand <david@xxxxxxxxxx> wrote:
On 22.03.24 20:46, Miklos Szeredi wrote:
On Fri, 22 Mar 2024 at 16:41, David Hildenbrand <david@xxxxxxxxxx> wrote:
But at least the vmsplice() just seems to work. Which is weird, because
GUP-fast should not apply (page not faulted in?)
But it is faulted in, and that indeed seems to be the root cause.
secretmem mmap() won't populate the page tables. So it's not faulted in yet.
When we GUP via vmsplice, GUP-fast should not find it in the page tables
and fallback to slow GUP.
There, we seem to pass check_vma_flags(), trigger faultin_page() to
fault it in, and then find it via follow_page_mask().
... and I wonder how we manage to skip check_vma_flags(), or otherwise
managed to GUP it.
vmsplice() should, in theory, never succeed here.
Weird :/
Improved repro:
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/syscall.h>
int main(void)
{
int fd1, fd2;
int pip[2];
struct iovec iov;
char *addr;
int ret;
fd1 = syscall(__NR_memfd_secret, 0);
addr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd1, 0);
ftruncate(fd1, 7);
addr[0] = 1; /* fault in page */
Here the page is faulted in and GUP-fast will find it. It's not in
the kernel page table, but it is in the user page table, which is what
matter for GUP.
Trust me, I know the GUP code very well :P
gup_pte_range -- GUP fast -- contains:
if (unlikely(folio_is_secretmem(folio))) {
gup_put_folio(folio, 1, flags);
goto pte_unmap;
}
So we "should" be rejecting any secretmem folios and fallback to GUP slow.
... we don't check the same in gup_huge_pmd(), but we shouldn't ever see
THP in secretmem code.
Ehm:
[ 29.441405] Secretmem fault: PFN: 1096177
[ 29.442092] GUP-fast: PFN: 1096177
... is folio_is_secretmem() broken?
... is it something "obvious" like:
diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index 35f3a4a8ceb1e..6996f1f53f147 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -16,7 +16,7 @@ static inline bool folio_is_secretmem(struct folio *folio)
* We know that secretmem pages are not compound and LRU so we can
* save a couple of cycles here.
*/
- if (folio_test_large(folio) || !folio_test_lru(folio))
+ if (folio_test_large(folio) || folio_test_lru(folio))
return false;
mapping = (struct address_space *)