Re: [PATCH v2] mm/filemap: Allow arch to request folio size for exec memory
From: Matthew Wilcox
Date: Thu Feb 15 2024 - 19:58:36 EST
On Fri, Feb 16, 2024 at 11:04:00AM +1100, Dave Chinner wrote:
> > The reason for the low likelihood is that the current readahead algorithm
> > starts with an order-2 folio and increases the folio order by 2 every
> > time the readahead mark is hit. But most executable memory is faulted in
> > fairly randomly and so the readahead mark is rarely hit and most
> > executable folios remain order-2.
>
> Yup, this is a bug in the readahead code, and really has nothing to
> do with executable files, mmap or the architecture. We don't want
> some magic new VM_EXEC min folio size per architecture thingy to be
> set - we just want readahead to do the right thing.
>
> Indeed, we are already adding a mapping minimum folio order
> directive to the address space to allow for filesystem block sizes
> greater than PAGE_SIZE. That's the generic mechanism that this
> functionality requires. See here:
>
> https://lore.kernel.org/linux-xfs/20240213093713.1753368-5-kernel@xxxxxxxxxxxxxxxx/
>
> (Probably worth reading some of the other readahead mods in that
> series and the discussion because readahead needs to ensure that it
> fills entire high-order folios in a single IO to avoid partial folio
> up-to-date states from partial reads.)
>
> IOWs, it seems to me that we could use this proposed generic mapping
> min order functionality when mmap() is run and VM_EXEC is set to set
> the min order to, say, 64kB. Then the readahead code would simply do
> the right thing, as would all other reads and writes to that
> mapping.
>
> We could trigger this in the ->mmap() method of the filesystem so
> that filesystems that can use large folios can turn it on, whilst
> other filesystems remain blissfully unaware of the functionality.
> Filesystems could also do smarter things here, too, e.g. enable PMD
> alignment for large mapped files....

We already enable PMD alignment for large mapped files (which caused
some shrieking from those who believe that ASLR continues to offer
worthwhile protection); see commit efa7df3e3bb5.

My problem with your minimum order proposal is that it would be a
hard failure if we couldn't get an order-4 folio. I think we'd do
better to set the ra_state parameters to 64KiB at mmap time (and I
think that's better done in the MM, not in each FS).
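
To make the distinction concrete, here is a userspace toy model, not
kernel code. It assumes 4KiB base pages, so 64KiB works out to order 4.
Every toy_* name is hypothetical, and the "allocator" simply pretends
that nothing above a given order is free. It only contrasts a hard
minimum order with a size hint that is allowed to fall back:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4KiB base pages */

/* Smallest order whose folio covers `bytes`. */
static unsigned int toy_order_for_size(size_t bytes)
{
	unsigned int order = 0;

	assert(bytes > 0);
	while (((size_t)1 << (TOY_PAGE_SHIFT + order)) < bytes)
		order++;
	return order;
}

/* Hypothetical allocator: nothing above max_avail is free. */
static bool toy_alloc(unsigned int order, unsigned int max_avail)
{
	return order <= max_avail;
}

/* Hard minimum: we get min_order or the request fails (-1). */
static int toy_alloc_min_order(unsigned int min_order, unsigned int max_avail)
{
	return toy_alloc(min_order, max_avail) ? (int)min_order : -1;
}

/* Hint: try the wanted order first, then fall back towards order-0. */
static int toy_alloc_hinted(unsigned int want, unsigned int max_avail)
{
	for (int order = (int)want; order >= 0; order--)
		if (toy_alloc((unsigned int)order, max_avail))
			return order;
	return -1;
}
```

With memory fragmented so that only order-2 is available,
toy_alloc_min_order(4, 2) fails while toy_alloc_hinted(4, 2) still
returns an order-2 folio. When order-4 is available, both return 4.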