Re: [BUG?] mm/secretmem: memory address mapped to memfd_secret can be used in write syscall.

From: David Hildenbrand
Date: Mon Nov 13 2023 - 09:43:39 EST


On 13.11.23 14:26, Theodore Ts'o wrote:
> On Mon, Nov 13, 2023 at 10:15:05AM +0100, David Hildenbrand wrote:
> >
> > According to the man page:
> >
> > "The memory areas backing the file created with memfd_secret(2) are visible
> > only to the processes that have access to the file descriptor. The memory
> > region is removed from the kernel page tables and only the page tables of
> > the processes holding the file descriptor map the corresponding physical
> > memory. (Thus, the pages in the region can't be accessed by the kernel
> > itself, so that, for example, pointers to the region can't be passed to
> > system calls.)"
> >
> > I'm not sure if the last part is actually true, if the syscalls end up
> > walking user page tables to copy data in/out.
>
> The idea behind removing it from the kernel page tables is so that
> kernel code running in some other process context won't be able to
> reference the memory via the kernel address space. (So if there is
> some kind of kernel zero-day which allows arbitrary code execution,
> the injected attack code would have to play games with page tables
> before being able to reference the memory --- this is not
> *impossible*, just more annoying.)
>
> But if you are doing a buffered write, the copy from the user-supplied
> buffer to the page cache is happening in the process's context. So
> "foreground kernel code" can dereference the user-supplied pointer
> just fine.

Right, so the statement in the man page is imprecise.
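
For anyone who wants to check this on their own kernel, here is a rough,
untested-on-your-setup sketch of a probe. It maps a memfd_secret area, fills
it, and passes the pointer to write(2). Assumptions on my side: the helper
name secretmem_write_probe() is made up for illustration, the fallback
syscall number 447 is the unified number (x86-64, arm64 and friends; check
your architecture), and a pipe stands in for a regular file to avoid
filesystem side effects. The copy from the user buffer still runs in the
calling process's context, which is the case being discussed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SYS_memfd_secret
#define SYS_memfd_secret 447	/* assumption: unified syscall number */
#endif

/*
 * Hypothetical helper, not taken from the kernel tree.
 *
 * Returns  0 if write(2) accepted a buffer backed by secretmem,
 *          1 if write(2) rejected it with EFAULT,
 *         -1 if the probe could not run (memfd_secret unavailable or
 *            disabled, mmap failed, unexpected write result, ...).
 */
int secretmem_write_probe(void)
{
	static const char msg[] = "secret";
	const size_t len = 4096;
	int pipefd[2] = { -1, -1 };
	void *buf = MAP_FAILED;
	int result = -1;
	ssize_t n;
	int sfd;

	/* glibc has no wrapper for memfd_secret, so use syscall(2). */
	sfd = (int)syscall(SYS_memfd_secret, 0);
	if (sfd < 0)
		return -1;

	if (ftruncate(sfd, (off_t)len) != 0)
		goto out;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, sfd, 0);
	if (buf == MAP_FAILED)
		goto out;

	/* Touch the page from user space so it is actually populated. */
	memcpy(buf, msg, sizeof(msg));

	if (pipe(pipefd) != 0)
		goto out;

	/* The kernel copies from the secretmem-backed pointer here. */
	n = write(pipefd[1], buf, sizeof(msg));
	if (n == (ssize_t)sizeof(msg))
		result = 0;
	else if (n < 0 && errno == EFAULT)
		result = 1;

out:
	if (pipefd[0] >= 0)
		close(pipefd[0]);
	if (pipefd[1] >= 0)
		close(pipefd[1]);
	if (buf != MAP_FAILED)
		munmap(buf, len);
	close(sfd);
	return result;
}
```

Call secretmem_write_probe() from a small main() and print the result. If
the reasoning above holds, a kernel with secretmem enabled should return 0,
but treat that as something to verify rather than a given.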

--
Cheers,

David / dhildenb