Re: [PATCH] mm/sparse: Consistently do not zero memmap

From: Pavel Tatashin
Date: Wed Oct 30 2019 - 12:53:55 EST


On Wed, Oct 30, 2019 at 11:31 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Wed 30-10-19 11:20:44, Pavel Tatashin wrote:
> > On Wed, Oct 30, 2019 at 10:13 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > >
> > > [Add Pavel - the email thread starts http://lkml.kernel.org/r/20191030131122.8256-1-vincent.whitchurch@xxxxxxxx
> > > but it used your old email address]
> > >
> > > On Wed 30-10-19 15:02:16, Vincent Whitchurch wrote:
> > > > On Wed, Oct 30, 2019 at 02:29:58PM +0100, Michal Hocko wrote:
> > > > > On Wed 30-10-19 14:11:22, Vincent Whitchurch wrote:
> > > > > > (I noticed this because on my ARM64 platform, with 1 GiB of memory the
> > > > > > first [and only] section is allocated from the zeroing path while with
> > > > > > 2 GiB of memory the first 1 GiB section is allocated from the
> > > > > > non-zeroing path.)
> > > > >
> > > > > Do I get it right that sparse_buffer_init couldn't allocate memmap for
> > > > > the full node for some reason and so sparse_init_nid would have to
> > > > > allocate one for each memory section?
> > > >
> > > > Not quite. The sparsemap_buf is successfully allocated with the correct
> > > > size in sparse_buffer_init(), but sparse_buffer_alloc() fails to
> > > > allocate the same size from it.
> > > >
> > > > The reason it fails is that sparse_buffer_alloc() for some reason wants
> > > > to return a pointer which is aligned to the allocation size. But the
> > > > sparsemap_buf was only allocated with PAGE_SIZE alignment so there's not
> > > > enough space to align it.
> > > >
> > > > I don't understand the reason for this alignment requirement since the
> > > > fallback path also allocates with PAGE_SIZE alignment. I'm guessing the
> > > > alignment is for the VMEMMAP code which also uses sparse_buffer_alloc()?
> > >
> > > I am not 100% sure TBH. Aligning makes some sense when mapping the
> > > memmaps to page tables but that would suggest that sparse_buffer_init
> > > is using a wrong alignment then. It is quite wasteful to allocate
> > > a large misaligned block like that.
> > >
> > > Your patch still makes sense but this is something to look into.
> > >
> > > Pavel?
> >
> > I remember thinking about this large alignment, as it looked out of
> > place to me also.
> > It was there to keep memmap in single chunks on larger x86 machines.
> > Perhaps it can be revisited now.
>
> Don't we need 2MB aligned memmaps for their PMD mappings?

Yes, PMD_SIZE should be the alignment here. It just does not make
sense to align to size.

Pasha
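
The alignment problem discussed in this thread can be illustrated with a
small userspace model. This is only a sketch under stated assumptions, not
the kernel implementation: the names model_buf, model_alloc and
model_sections_served are hypothetical, and the constants (4 KiB pages,
2 MiB PMD_SIZE, 16 MiB of memmap per 1 GiB section, i.e. 64 bytes per
struct page) are assumed values. It models a buffer sized for exactly N
sections and a carve-out step that rounds the pointer up to a requested
alignment and fails if the result no longer fits.

```c
#include <assert.h>
#include <stdint.h>

/* All constants below are assumptions for this model, not kernel values. */
#define MODEL_PAGE_SIZE (UINT64_C(1) << 12)   /* 4 KiB pages */
#define MODEL_PMD_SIZE  (UINT64_C(1) << 21)   /* 2 MiB PMD mapping */
#define MODEL_MEMMAP_SZ (UINT64_C(16) << 20)  /* memmap for one 1 GiB section */

/* Hypothetical stand-in for the sparsemap buffer: a simple bump region. */
struct model_buf {
	uint64_t cur; /* next free address */
	uint64_t end; /* one past the last usable address */
};

/* Round addr up to a power-of-two alignment. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/* Carve size bytes at the given alignment; returns 0 if they do not fit. */
static uint64_t model_alloc(struct model_buf *b, uint64_t size, uint64_t align)
{
	uint64_t ptr = align_up(b->cur, align);

	if (ptr + size > b->end)
		return 0;
	b->cur = ptr + size;
	return ptr;
}

/*
 * Build a buffer that starts at 'start' and holds exactly nr_sections
 * memmaps, then count how many per-section requests it can satisfy
 * when each request is aligned to 'align'.
 */
static int model_sections_served(uint64_t start, int nr_sections, uint64_t align)
{
	struct model_buf b = {
		start, start + (uint64_t)nr_sections * MODEL_MEMMAP_SZ
	};
	int served = 0;

	assert(start % MODEL_PAGE_SIZE == 0); /* buffer is at least page aligned */

	for (int i = 0; i < nr_sections; i++)
		if (model_alloc(&b, MODEL_MEMMAP_SZ, align))
			served++;
	return served;
}
```

In this model, a page-aligned buffer for one section cannot serve a single
size-aligned request, and a buffer for two sections serves only the first,
which is consistent with the 1 GiB versus 2 GiB behaviour reported above.
If both the buffer start and the requests use MODEL_PMD_SIZE alignment,
every section fits, because the memmap size is itself a multiple of 2 MiB.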