Re: [PATCH RFC] mm: add MAP_EXCLUSIVE to create exclusive user mappings

From: Edgecombe, Rick P
Date: Tue Oct 29 2019 - 13:27:49 EST


On Mon, 2019-10-28 at 22:00 +0100, Peter Zijlstra wrote:
> On Mon, Oct 28, 2019 at 07:59:25PM +0000, Edgecombe, Rick P wrote:
> > On Mon, 2019-10-28 at 14:55 +0100, Peter Zijlstra wrote:
> > > On Mon, Oct 28, 2019 at 04:16:23PM +0300, Kirill A. Shutemov wrote:
> > >
> > > > I think active use of this feature will lead to performance degradation of
> > > > the system with time.
> > > >
> > > > Setting a single 4k page non-present in the direct mapping will require
> > > > splitting 2M or 1G page we usually map direct mapping with. And it's one
> > > > way road. We don't have any mechanism to map the memory with huge page
> > > > again after the application has freed the page.
> > >
> > > Right, we recently had a 'bug' where ftrace triggered something like
> > > this and facebook ran into it as a performance regression. So yes, this
> > > is a real concern.
> >
> > Don't e/cBPF filters also break the direct map down to 4k pages when calling
> > set_memory_ro() on the filter for 64 bit x86 and arm?
> >
> > I've been wondering if the page allocator should make some effort to find a
> > broken down page for anything that can be known will have direct map
> > permissions changed (or if it already groups them somehow). But also, why any
> > potential slowdown of 4k pages on the direct map hasn't been noticed for apps
> > that do a lot of insertions and removals of BPF filters, if this is indeed the
> > case.
>
> That should be limited to the module range. Random data maps could
> shatter the world.

BPF has one vmalloc space allocation for the byte code and one module space
allocation for the JIT. Both also get RO set on the direct map alias of their
pages, and that alias is reset to RW when they are freed.
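
To make the concern concrete, here is a toy user-space sketch of the behavior
Kirill describes: changing permissions on a single 4k page forces the covering
2M mapping to be split, and resetting the page to RW does not merge it back.
This is only an illustration under stated assumptions. The struct and the
toy_* helpers are hypothetical names, not kernel code, and the model ignores 1G
mappings and TLB effects entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGES_PER_2M 512 /* 2 MiB / 4 KiB */
#define NR_REGIONS   8   /* toy direct map covering 16 MiB */
#define NR_PAGES     (NR_REGIONS * PAGES_PER_2M)

/* Hypothetical toy model of a direct map, not a kernel structure. */
struct toy_direct_map {
	bool split[NR_REGIONS]; /* region is mapped with 4k entries */
	bool ro[NR_PAGES];      /* per-4k-page read-only flag */
};

/* Marking one 4k page RO splits the 2M region that covers it. */
static void toy_set_ro(struct toy_direct_map *m, size_t pfn)
{
	assert(pfn < NR_PAGES);
	m->split[pfn / PAGES_PER_2M] = true;
	m->ro[pfn] = true;
}

/* Resetting to RW restores the permission but never re-merges the region. */
static void toy_set_rw(struct toy_direct_map *m, size_t pfn)
{
	assert(pfn < NR_PAGES);
	m->ro[pfn] = false;
}

/* Number of mapping entries needed to cover the whole toy direct map. */
static size_t toy_nr_mappings(const struct toy_direct_map *m)
{
	size_t n = 0;

	for (size_t i = 0; i < NR_REGIONS; i++)
		n += m->split[i] ? PAGES_PER_2M : 1;
	return n;
}
```

Starting from a zeroed struct toy_direct_map, toy_nr_mappings() reports 8
entries. After toy_set_ro() on two pages in different regions it reports 1030,
and it stays at 1030 after both pages are set back with toy_set_rw(). That is
the one way road in miniature.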

You mean shatter performance?