Re: [BUG?] mm/secretmem: memory address mapped to memfd_secret can be used in write syscall.

From: David Wang
Date: Mon Nov 13 2023 - 10:43:51 EST

At 2023-11-13 21:26:21, "Theodore Ts'o" <tytso@xxxxxxx> wrote:
>On Mon, Nov 13, 2023 at 10:15:05AM +0100, David Hildenbrand wrote:
>>
>> According to the man page:
>>
>> "The memory areas backing the file created with memfd_secret(2) are visible
>> only to the processes that have access to the file descriptor. The memory
>> region is removed from the kernel page tables and only the page tables of
>> the processes holding the file descriptor map the corresponding physical
>> memory. (Thus, the pages in the region can't be accessed by the kernel
>> itself, so that, for example, pointers to the region can't be passed to
>> system calls.)
>>
>> I'm not sure if the last part is actually true, if the syscalls end up
>> walking user page tables to copy data in/out.
>
>The idea behind removing it from the kernel page tables is so that
>kernel code running in some other process context won't be able to
>reference the memory via the kernel address space. (So if there is
>some kind of kernel zero-day which allows arbitrary code execution,
>the injected attack code would have to play games with page tables
>before being able to reference the memory --- this is not
>*impossible*, just more annoying.)
>
>But if you are doing a buffered write, the copy from the user-supplied
>buffer to the page cache is happening in the process's context. So
>"foreground kernel code" can dereference the user-supplied pointer
>just fine.
>

But the inconsistent treatment in the kernel (the memfd is denied while the mmapped address is allowed) is kind of confusing.
I thought those two should be treated the same way.
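
For reference, here is a rough sketch of the mmapped-address case as I understand it. It is only an illustration, not the original reproducer: the function name and the scratch file under /tmp are made up, and it assumes a kernel and libc headers that provide SYS_memfd_secret with secretmem enabled. It maps one page from memfd_secret() and passes that address to a plain buffered write().

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns  0 if write(2) accepted a buffer that lives in secretmem,
 *          1 if write(2) rejected it,
 *         -1 if setup failed (e.g. memfd_secret is unavailable or disabled).
 */
int try_write_from_secretmem(void)
{
#ifdef SYS_memfd_secret
	static const char msg[] = "secret";
	char path[] = "/tmp/secretmem-XXXXXX";	/* assumed scratch location */
	const size_t len = (size_t)sysconf(_SC_PAGESIZE);
	char *buf = MAP_FAILED;
	int sfd, out = -1, rc = -1;
	ssize_t n;

	/* Create the secret memory fd and size it to one page. */
	sfd = (int)syscall(SYS_memfd_secret, 0);
	if (sfd < 0)
		return -1;
	if (ftruncate(sfd, (off_t)len) < 0)
		goto done;

	/* secretmem only supports shared mappings. */
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, sfd, 0);
	if (buf == MAP_FAILED)
		goto done;
	memcpy(buf, msg, sizeof(msg));

	/* An ordinary file opened without O_DIRECT, so the write is buffered. */
	out = mkstemp(path);
	if (out < 0)
		goto done;
	unlink(path);

	/*
	 * The copy from buf happens in this process's context, through this
	 * process's own page tables, which still map the secret page.
	 */
	n = write(out, buf, sizeof(msg));
	rc = (n == (ssize_t)sizeof(msg)) ? 0 : 1;
done:
	if (out >= 0)
		close(out);
	if (buf != MAP_FAILED)
		munmap(buf, len);
	close(sfd);
	return rc;
#else
	return -1;	/* headers too old to know about memfd_secret */
#endif
}
```

If the explanation quoted above holds, I would expect this to return 0 on a kernel with secretmem enabled, and -1 where memfd_secret is not available.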

Thanks
David Wang