Re: BUG: unable to handle kernel paging request in fuse_copy_do

From: David Hildenbrand
Date: Fri Mar 22 2024 - 17:56:26 EST


On 22.03.24 22:37, David Hildenbrand wrote:
On 22.03.24 22:33, David Hildenbrand wrote:
On 22.03.24 22:18, David Hildenbrand wrote:
On 22.03.24 22:13, Miklos Szeredi wrote:
On Fri, 22 Mar 2024 at 22:08, David Hildenbrand <david@xxxxxxxxxx> wrote:

On 22.03.24 20:46, Miklos Szeredi wrote:
On Fri, 22 Mar 2024 at 16:41, David Hildenbrand <david@xxxxxxxxxx> wrote:

But at least the vmsplice() just seems to work. Which is weird, because
GUP-fast should not apply (page not faulted in?)

But it is faulted in, and that indeed seems to be the root cause.

secretmem mmap() won't populate the page tables. So it's not faulted in yet.

When we GUP via vmsplice, GUP-fast should not find it in the page tables
and fallback to slow GUP.

There, we seem to pass check_vma_flags(), trigger faultin_page() to
fault it in, and then find it via follow_page_mask().

... and I wonder how we manage to skip check_vma_flags(), or otherwise
managed to GUP it.

vmsplice() should, in theory, never succeed here.

Weird :/

Improved repro:

#define _GNU_SOURCE

#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/syscall.h>

int main(void)
{
int fd1, fd2;
int pip[2];
struct iovec iov;
char *addr;
int ret;

fd1 = syscall(__NR_memfd_secret, 0);
addr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd1, 0);
ftruncate(fd1, 7);
addr[0] = 1; /* fault in page */

Here the page is faulted in and GUP-fast will find it. It's not in
the kernel page table, but it is in the user page table, which is what
matter for GUP.

Trust me, I know the GUP code very well :P

gup_pte_range -- GUP fast -- contains:

if (unlikely(folio_is_secretmem(folio))) {
gup_put_folio(folio, 1, flags);
goto pte_unmap;
}

So we "should" be rejecting any secretmem folios and fallback to GUP slow.


... we don't check the same in gup_huge_pmd(), but we shouldn't ever see
THP in secretmem code.


Ehm:

[ 29.441405] Secretmem fault: PFN: 1096177
[ 29.442092] GUP-fast: PFN: 1096177


... is folio_is_secretmem() broken?

... is it something "obvious" like:

diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index 35f3a4a8ceb1e..6996f1f53f147 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -16,7 +16,7 @@ static inline bool folio_is_secretmem(struct folio *folio)
* We know that secretmem pages are not compound and LRU so we can
* save a couple of cycles here.
*/
- if (folio_test_large(folio) || !folio_test_lru(folio))
+ if (folio_test_large(folio) || folio_test_lru(folio))
return false;
mapping = (struct address_space *)

... yes, that does the trick!


Proper patch (I might send out again on Monday "officially"). There are
other improvements we want to do to folio_is_secretmem() in the light of
folio_fast_pin_allowed(), that I wanted to do a while ago. I might send
a patch for that as well now that I'm at it.


From 85558a46d9f249f26bd77dd3b18d14f248464845 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david@xxxxxxxxxx>
Date: Fri, 22 Mar 2024 22:45:36 +0100
Subject: [PATCH] mm/secretmem: fix GUP-fast succeeding on secretmem folios

folio_is_secretmem() states that secretmem folios cannot be LRU folios:
so we may only exit early if we find an LRU folio. Yet, we exit early if
we find a folio that is not a secretmem folio.

Consequently, folio_is_secretmem() fails to detect secretmem folios and,
therefore, we can succeed in grabbing a secretmem folio during GUP-fast,
crashing the kernel when we later try reading/writing to the folio, because
the folio has been unmapped from the directmap.

Reported-by: xingwei lee <xrivendell7@xxxxxxxxx>
Reported-by: yue sun <samsun1006219@xxxxxxxxx>
Debugged-by: Miklos Szeredi <miklos@xxxxxxxxxx>
Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
---
include/linux/secretmem.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index 35f3a4a8ceb1..6996f1f53f14 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -16,7 +16,7 @@ static inline bool folio_is_secretmem(struct folio *folio)
* We know that secretmem pages are not compound and LRU so we can
* save a couple of cycles here.
*/
- if (folio_test_large(folio) || !folio_test_lru(folio))
+ if (folio_test_large(folio) || folio_test_lru(folio))
return false;
mapping = (struct address_space *)
--
2.43.2


--
Cheers,

David / dhildenb