Re: [PATCH] uprobes: use vm_special_mapping close() functionality

From: Oleg Nesterov
Date: Wed Sep 04 2024 - 05:57:03 EST


I didn't have time to write a full reply yesterday.

On 09/03, Linus Torvalds wrote:
>
> On Tue, 3 Sept 2024 at 02:09, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> > but with or without this fix __create_xol_area() also needs
> >
> > area->xol_mapping.mremap = NULL;
>
> I think the whole thing needs to be zeroed out.

Again, this is what Sven did and I agree with this fix for now.

> It was always horribly buggy. The close thing just made it more
> *obviously* buggy, because closing a vma is a lot more common than
> mremap'ing it.

Well, this code is very old, I don't think it was "always" buggy.
But at least today it is horribly ugly, and

> Either use kzalloc(), or do a proper initializer something like this:
>
> - area->xol_mapping.name = "[uprobes]";
> - area->xol_mapping.fault = NULL;
> - area->xol_mapping.pages = area->pages;
> + area->xol_mapping = (struct vm_special_mapping) {
> + .name = "[uprobes]",
> + .pages = area->pages,
> + .close = uprobe_clear_state,
> + };

either way the code is still ugly, imo.

How about the (untested) patch below?

I am not going to send this patch right now, it conflicts with the ongoing
close/mremap changes, but what do you think?

We do not need to add the instance of vm_special_mapping into mm_struct,
a single instance (xol_mapping below) with .fault = xol_fault() is enough.
And, with this change xol_area no longer needs "struct page *pages[2]", it
can be turned into "struct page *page".

Oleg.
---

--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -101,7 +101,6 @@ struct xol_area {
atomic_t slot_count; /* number of in-use slots */
unsigned long *bitmap; /* 0 = free slot */

- struct vm_special_mapping xol_mapping;
struct page *pages[2];
/*
* We keep the vma's vm_start rather than a pointer to the vma
@@ -1453,6 +1452,21 @@ void uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned lon
set_bit(MMF_RECALC_UPROBES, &vma->vm_mm->flags);
}

+static vm_fault_t xol_fault(const struct vm_special_mapping *sm,
+ struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+ struct xol_area *area = vma->vm_mm->uprobes_state.xol_area;
+
+ vmf->page = area->pages[0];
+ get_page(vmf->page);
+ return 0;
+}
+
+static const struct vm_special_mapping xol_mapping = {
+ .name = "[uprobes]",
+ .fault = xol_fault,
+};
+
/* Slot allocation for XOL */
static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
{
@@ -1479,7 +1493,7 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)

vma = _install_special_mapping(mm, area->vaddr, PAGE_SIZE,
VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO,
- &area->xol_mapping);
+ &xol_mapping);
if (IS_ERR(vma)) {
ret = PTR_ERR(vma);
goto fail;
@@ -1518,9 +1532,6 @@ static struct xol_area *__create_xol_area(unsigned long vaddr)
if (!area->bitmap)
goto free_area;

- area->xol_mapping.name = "[uprobes]";
- area->xol_mapping.fault = NULL;
- area->xol_mapping.pages = area->pages;
area->pages[0] = alloc_page(GFP_HIGHUSER);
if (!area->pages[0])
goto free_bitmap;