Re: [RFC PATCH 1/5] mm: introduce __GFP_UNMAPPED and unmapped_alloc()

From: Michal Hocko
Date: Tue Mar 28 2023 - 03:39:44 EST


On Tue 28-03-23 09:25:35, Mike Rapoport wrote:
> On Mon, Mar 27, 2023 at 03:43:27PM +0200, Michal Hocko wrote:
> > On Sat 25-03-23 09:38:12, Mike Rapoport wrote:
> > > On Fri, Mar 24, 2023 at 09:37:31AM +0100, Michal Hocko wrote:
> > > > On Wed 08-03-23 11:41:02, Mike Rapoport wrote:
> > > > > From: "Mike Rapoport (IBM)" <rppt@xxxxxxxxxx>
> > > > >
> > > > > > When set_memory or set_direct_map APIs are used to change attributes or
> > > > > > permissions for chunks of several pages, the large PMD that maps these
> > > > > > pages in the direct map must be split. Fragmenting the direct map in such
> > > > > > a manner causes TLB pressure and, eventually, performance degradation.
> > > > >
> > > > > > To avoid excessive direct map fragmentation, add the ability to allocate
> > > > > > "unmapped" pages with the __GFP_UNMAPPED flag that will cause removal of the
> > > > > > allocated pages from the direct map and use a cache of the unmapped pages.
> > > > >
> > > > > This cache is replenished with higher order pages with preference for
> > > > > PMD_SIZE pages when possible so that there will be fewer splits of large
> > > > > pages in the direct map.
> > > > >
> > > > > The cache is implemented as a buddy allocator, so it can serve high order
> > > > > allocations of unmapped pages.
> > > >
> > > > Why do we need a dedicated gfp flag for all this when a dedicated
> > > > allocator is used anyway? What prevents users from calling
> > > > unmapped_pages_{alloc,free}?
> > >
> > > Using unmapped_pages_{alloc,free} adds complexity to the users which IMO
> > > outweighs the cost of a dedicated gfp flag.
> >
> > Aren't those users rare and very special anyway?
> >
> > > For modules we'd have to make x86::module_{alloc,free}() take care of
> > > mapping and unmapping the allocated pages in the modules' virtual address
> > > range. This also might become relevant for other architectures in the
> > > future, and then we'll have several complex module_alloc()s.
> >
> > The module_alloc use is lacking any justification. More context would be
> > more than useful. Also vmalloc support for the proposed __GFP_UNMAPPED
> > likely needs more explanation as well.
>
> Right now module_alloc() boils down to vmalloc() with the virtual range
> limited to the modules area. The allocated chunk contains both code and
> data. When CONFIG_STRICT_MODULE_RWX is set, parts of the memory allocated
> with module_alloc() are remapped with different permissions both in the
> vmalloc address space and in the direct map. The change of permissions for small
> ranges causes splits of large pages in the direct map.

OK, so you want to reduce that direct map fragmentation? Is that a real
problem? My impression is that modules are a mostly static thing. BPF
might be a different thing though. I have a recollection that the BPF guys
were dealing with direct map fragmentation as well.
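
To put rough numbers on the fragmentation being discussed, here is a
minimal sketch, assuming x86-64 with 4 KiB base pages and 2 MiB PMD
mappings. Neither those sizes nor any of the sketch_* names come from the
patch; they are only illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed x86-64 sizes; illustrative only, not taken from the patch. */
#define SKETCH_PAGE_SIZE (4096UL)
#define SKETCH_PMD_SIZE  (2UL * 1024 * 1024)

/*
 * Number of base-page entries that replace one large PMD mapping
 * once that mapping has been split.
 */
static unsigned long sketch_ptes_per_pmd(void)
{
	assert(SKETCH_PMD_SIZE % SKETCH_PAGE_SIZE == 0);
	return SKETCH_PMD_SIZE / SKETCH_PAGE_SIZE;
}

/*
 * Count the distinct PMD-sized regions overlapped by the byte range
 * [start, start + len). Each of them may have to be split when the
 * permissions change for only part of it.
 */
static unsigned long sketch_pmds_touched(unsigned long start,
					 unsigned long len)
{
	unsigned long first, last;

	if (len == 0)
		return 0;

	first = start / SKETCH_PMD_SIZE;
	last = (start + len - 1) / SKETCH_PMD_SIZE;
	return last - first + 1;
}
```

Under these assumptions, a permission change on a single 4 KiB page turns
one large mapping into sketch_ptes_per_pmd() small ones, and a range that
straddles a 2 MiB boundary touches two large mappings.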

> If we were to use unmapped_pages_alloc() in module_alloc(), we would have
> to implement the part of vmalloc() that reserves the virtual addresses and
> maps the allocated memory there in module_alloc().

Another option would be to provide an allocator for the backing pages to
vmalloc. But I do agree that a gfp flag is a less laborious way to
achieve the same. So the primary question is whether we really
need vmalloc support for unmapped memory.
--
Michal Hocko
SUSE Labs