Re: [RFC 2/3] vmalloc: Support grouped page allocations

From: Edgecombe, Rick P
Date: Mon Apr 05 2021 - 17:49:41 EST


On Mon, 2021-04-05 at 22:32 +0100, Matthew Wilcox wrote:
> On Mon, Apr 05, 2021 at 02:01:58PM -0700, Dave Hansen wrote:
> > On 4/5/21 1:37 PM, Rick Edgecombe wrote:
> > > +static void __dispose_pages(struct list_head *head)
> > > +{
> > > +       struct list_head *cur, *next;
> > > +
> > > +       list_for_each_safe(cur, next, head) {
> > > +               list_del(cur);
> > > +
> > > +               /* The list head is stored at the start of the
> > > page */
> > > +               free_page((unsigned long)cur);
> > > +       }
> > > +}
> >
> > This is interesting.
> >
> > While the page is in the allocator, you're using the page contents
> > themselves to store the list_head.  It took me a minute to figure
> > out
> > what you were doing here because: "start of the page" is a bit
> > ambiguous.  It could mean:
> >
> >  * the first 16 bytes in 'struct page'
> > or
> >  * the first 16 bytes in the page itself, aka *page_address(page)
> >
> > The fact that this doesn't work on higmem systems makes this an OK
> > thing
> > to do, but it is a bit weird.  It's also doubly susceptible to bugs
> > where there's a page_to_virt() or virt_to_page() screwup.
> >
> > I was *hoping* there was still sufficient space in 'struct page'
> > for
> > this second list_head in addition to page->lru.  I think there
> > *should*
> > be.  That would at least make this allocator a bit more "normal" in
> > not
> > caring about page contents while the page is free in the
> > allocator.  If
> > you were able to do that you could do things like kmemcheck or page
> > alloc debugging while the page is in the allocator.
> >
> > Anyway, I think I'd prefer that you *try* to use 'struct page'
> > alone.
> > But, if that doesn't work out, please comment the snot out of this
> > thing
> > because it _is_ weird.
>
> Hi!  Current closest-thing-we-have-to-an-expert-on-struct-page here!
>
> I haven't read over these patches yet.  If these pages are in use by
> vmalloc, they can't use mapping+index because get_user_pages() will
> call
> page_mapping() and the list_head will confuse it.  I think it could
> use
> index+private for a list_head.
>
> If the pages are in the buddy, I _think_ mapping+index are free. 
> private
> is in use for buddy order.  But I haven't read through the buddy code
> in a while.
>
> Does it need to be a doubly linked list?  Can it be an hlist?

It does need to be a doubly linked list. I think they should never be
mapped to userspace. As far as the page allocator is concerned these
pages are not free. And they are not compound.

Originally I was just using the lru member. Would it be ok in that
case?