Re: [syzbot] WARNING in hugetlb_wp

From: David Hildenbrand
Date: Mon Oct 31 2022 - 10:14:35 EST

Next message: Peter Xu: "Re: [PATCH 0/2] mm: Use pte marker for swapin errors"
Previous message: Camelia Alexandra Groza: "RE: [PATCH net-next v7 08/10] powerpc: dts: t208x: Mark MAC1 and MAC2 as 10G"
In reply to: David Hildenbrand: "Re: [syzbot] WARNING in hugetlb_wp"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 30.10.22 09:53, David Hildenbrand wrote:

On 30.10.22 02:35, Mike Kravetz wrote:

On 10/28/22 22:15, syzbot wrote:

Hello,

syzbot found the following issue on:

HEAD commit: 247f34f7b803 Linux 6.1-rc2
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14a9efd2880000
kernel config: https://syzkaller.appspot.com/x/.config?x=a66c6c673fb555e8
dashboard link: https://syzkaller.appspot.com/bug?extid=f0b97304ef90f0d0b1dc
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=112217b4880000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=105f4616880000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/de212436b09b/disk-247f34f7.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/63c4feda220f/vmlinux-247f34f7.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f0b97304ef90f0d0b1dc@xxxxxxxxxxxxxxxxxxxxxxxxx

------------[ cut here ]------------
WARNING: CPU: 1 PID: 3612 at mm/hugetlb.c:5313 hugetlb_wp+0x20a/0x1af0 mm/hugetlb.c:5313

This warning was added with commit 1d8d14641fd94 ("mm/hugetlb: support write-faults
in shared mappings"). It is there 'by design' as this this specific
type of write fault is not supported.

/*
* hugetlb does not support FOLL_FORCE-style write faults that keep the
* PTE mapped R/O such as maybe_mkwrite() would do.
*/
if (WARN_ON_ONCE(!unshare && !(vma->vm_flags & VM_WRITE)))
return VM_FAULT_SIGSEGV;

If there is an actual use case for this support, we can look at adding it.

Right, it's by design and in retrospective it was the right approach to add this
check there. We seem to have a way to trigger a hugetlb write
fault without VM_WRITE set from GUP. We have to fence that.

Interestingly, I thought I tried to trigger that exact scenario.

a) Have a MAP_PRIVATE, PROT_READ hugetlb mapping
b) Try writing to it via /proc/self/mem, triggering debug access with FOLL_FORCE

The expectation is that this will fail with -EFAULT on hugetlb. I could have
sworn that it did the right thing when I tried :)

But staring at follow_hugetlb_page(), I think we will end up triggering a
write fault (FAULT_FLAG_WRITE) on hugetlb.

The easiest fix might be to special-case hugetlb VMA in check_vma_flags():

From 39d2a525cae62e7d2766ecfc4337ed4d59555d9d Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david@xxxxxxxxxx>
Date: Sun, 30 Oct 2022 09:45:50 +0100
Subject: [PATCH] mm/gup: disallow FOLL_FORCE on hugetlb mappings

TODO

Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
---
mm/gup.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/mm/gup.c b/mm/gup.c
index fe195d47de74..b934687efdec 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1076,6 +1076,9 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
*/
if (!is_cow_mapping(vm_flags))
return -EFAULT;
+ /* hugetlb does not support FOLL_FORCE. */
+ if (is_vm_hugetlb_page(vma))
+ return -EFAULT;
}
} else if (!(vm_flags & VM_READ)) {
if (!(gup_flags & FOLL_FORCE))

A simple reproducer:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <stdint.h>
#include <sys/mman.h>
#include <linux/mman.h>

int main(int argc, char **argv)
{
char *map;
int mem_fd;

map = mmap(NULL, 2 * 1024 * 1024u, PROT_READ,
MAP_PRIVATE|MAP_ANON|MAP_HUGETLB|MAP_HUGE_2MB, -1, 0);
if (map == MAP_FAILED) {
fprintf(stderr, "mmap() failed: %d\n", errno);
return 1;
}

mem_fd = open("/proc/self/mem", O_RDWR);
if (mem_fd < 0) {
fprintf(stderr, "open(/proc/self/mem) failed: %d\n", errno);
return 1;
}

if (pwrite(mem_fd, "0", 1, (uintptr_t) map) == 1) {
fprintf(stderr, "write() succeeded, which is unexpected\n");
return 1;
}

printf("write() failed as expected: %d\n", errno);
return 0;
}

I started looking at the follow_hugetlb_page() call in __get_user_pages() and
my head seriously started to hurt.

Why are we storing to "i" and error from follow_hugetlb_page()->hugetlb_fault()
and then eventually happily continuing?

I'm afraid of touching that code, it looks too fragile.

Hopefully I am missing something important and it's all perfectly
fine code.

--
Thanks,

David / dhildenb

Next message: Peter Xu: "Re: [PATCH 0/2] mm: Use pte marker for swapin errors"
Previous message: Camelia Alexandra Groza: "RE: [PATCH net-next v7 08/10] powerpc: dts: t208x: Mark MAC1 and MAC2 as 10G"
In reply to: David Hildenbrand: "Re: [syzbot] WARNING in hugetlb_wp"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]