Re: [PATCH RFC] mm: add MAP_EXCLUSIVE to create exclusive user mappings
From: Dan Williams
Date: Tue Oct 29 2019 - 01:44:16 EST
On Mon, Oct 28, 2019 at 6:16 AM Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote:
> On Mon, Oct 28, 2019 at 02:00:19PM +0100, Mike Rapoport wrote:
> > On Mon, Oct 28, 2019 at 03:31:24PM +0300, Kirill A. Shutemov wrote:
> > > On Sun, Oct 27, 2019 at 12:17:32PM +0200, Mike Rapoport wrote:
> > > > From: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> > > >
> > > > The mappings created with MAP_EXCLUSIVE are visible only in the context of
> > > > the owning process and can be used by applications to store secret
> > > > information that is hidden not only from other processes but from the
> > > > kernel as well.
> > > >
> > > > The pages in these mappings are removed from the kernel direct map and
> > > > marked with the PG_user_exclusive flag. When the exclusive area is unmapped,
> > > > the pages are mapped back into the direct map.
> > >
> > > I'm probably blind, but I don't see where you manipulate the direct map...
> > __get_user_pages() calls __set_page_user_exclusive() which in turn calls
> > set_direct_map_invalid_noflush() that makes the page not present.
> Ah. okay.
> I think active use of this feature will lead to performance degradation of
> the system over time.
> Setting a single 4k page non-present in the direct mapping will require
> splitting the 2M or 1G page we usually map the direct mapping with. And it's
> a one-way road. We don't have any mechanism to map the memory with a huge
> page again after the application has freed the page.
> It might be okay if all these pages cluster together, but I don't think we
> have a way to achieve it easily.
Still, it would be worth exploring what that would look like, if not
for MAP_EXCLUSIVE then for set_mce_nospec(), which wants to punch
poisoned pages out of the direct map. In the case of pmem, where those
pages can be repaired, it would be nice to also repair the mapping
granularity of the direct map.
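
To put rough numbers on the splitting cost Kirill describes, here is a
back-of-the-envelope userspace sketch, not kernel code. It assumes x86-64
paging (512 entries per table level) and that only the minimum necessary
split is performed. The helper names entries_after_4k_hole() and
tables_needed_for_4k_hole() are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 mapping sizes; each page-table level holds 512 entries. */
#define SZ_4K_PAGE  (1ULL << 12)
#define SZ_2M_PAGE  (1ULL << 21)
#define SZ_1G_PAGE  (1ULL << 30)
#define ENTRIES_PER_TABLE 512ULL

/*
 * Hypothetical helper: how many leaf entries cover the original range
 * once a single 4K page inside it has been made non-present. The count
 * includes the non-present entry itself.
 */
uint64_t entries_after_4k_hole(uint64_t mapping_size)
{
	assert(mapping_size == SZ_4K_PAGE ||
	       mapping_size == SZ_2M_PAGE ||
	       mapping_size == SZ_1G_PAGE);

	if (mapping_size == SZ_1G_PAGE)
		/* 511 untouched 2M entries plus 512 4K entries. */
		return (ENTRIES_PER_TABLE - 1) + ENTRIES_PER_TABLE;
	if (mapping_size == SZ_2M_PAGE)
		/* One 2M entry becomes a full table of 4K entries. */
		return ENTRIES_PER_TABLE;
	return 1;
}

/*
 * Hypothetical helper: how many new page-table pages the split needs.
 */
unsigned int tables_needed_for_4k_hole(uint64_t mapping_size)
{
	if (mapping_size == SZ_1G_PAGE)
		return 2;	/* one PMD table plus one PTE table */
	if (mapping_size == SZ_2M_PAGE)
		return 1;	/* one PTE table */
	return 0;
}
```

Under those assumptions, one 4K hole turns a single 1G entry into 1023
entries, and nothing merges them back once the hole is filled again.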