Re: [RFC 2/3] vmalloc: Support grouped page allocations

From: Matthew Wilcox
Date: Mon Apr 05 2021 - 17:33:42 EST


On Mon, Apr 05, 2021 at 02:01:58PM -0700, Dave Hansen wrote:
> On 4/5/21 1:37 PM, Rick Edgecombe wrote:
> > +static void __dispose_pages(struct list_head *head)
> > +{
> > +	struct list_head *cur, *next;
> > +
> > +	list_for_each_safe(cur, next, head) {
> > +		list_del(cur);
> > +
> > +		/* The list head is stored at the start of the page */
> > +		free_page((unsigned long)cur);
> > +	}
> > +}
>
> This is interesting.
>
> While the page is in the allocator, you're using the page contents
> themselves to store the list_head. It took me a minute to figure out
> what you were doing here because: "start of the page" is a bit
> ambiguous. It could mean:
>
> * the first 16 bytes in 'struct page'
> or
> * the first 16 bytes in the page itself, aka *page_address(page)
>
> The fact that this doesn't work on highmem systems makes this an OK thing
> to do, but it is a bit weird. It's also doubly susceptible to bugs
> where there's a page_to_virt() or virt_to_page() screwup.
>
> I was *hoping* there was still sufficient space in 'struct page' for
> this second list_head in addition to page->lru. I think there *should*
> be. That would at least make this allocator a bit more "normal" in not
> caring about page contents while the page is free in the allocator. If
> you were able to do that you could do things like kmemcheck or page
> alloc debugging while the page is in the allocator.
>
> Anyway, I think I'd prefer that you *try* to use 'struct page' alone.
> But, if that doesn't work out, please comment the snot out of this thing
> because it _is_ weird.

Hi! Current closest-thing-we-have-to-an-expert-on-struct-page here!

I haven't read over these patches yet. If these pages are in use by
vmalloc, they can't use mapping+index because get_user_pages() will call
page_mapping() and the list_head will confuse it. I think it could use
index+private for a list_head.
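
To make the index+private idea concrete, here is a rough userspace sketch.
It assumes a simplified mock of the page-cache arm of struct page (the
real structure is a union of several layouts), and mock_page,
mock_page_second_list() and index_private_fits_list_head() are
hypothetical names used only for illustration. All it checks is that
index and private sit next to each other and together are the size of a
list_head, while mapping stays untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * ASSUMPTION: simplified mock of one arm of struct page. Only the
 * field order matters for this sketch; the real layout is a union.
 */
struct mock_page {
	unsigned long flags;
	struct list_head lru;	/* the list_head already in use */
	void *mapping;		/* left alone so page_mapping() stays sane */
	unsigned long index;
	unsigned long private;
};

/*
 * Hypothetical helper: view index+private as a second list_head.
 * The kernel builds with -fno-strict-aliasing; plain userspace code
 * should not dereference the result without the same flag.
 */
static inline struct list_head *mock_page_second_list(struct mock_page *page)
{
	return (struct list_head *)&page->index;
}

/* Returns 1 when index+private are contiguous and list_head-sized. */
static inline int index_private_fits_list_head(void)
{
	return offsetof(struct mock_page, private) ==
	       offsetof(struct mock_page, index) + sizeof(unsigned long) &&
	       2 * sizeof(unsigned long) == sizeof(struct list_head);
}
```

Whether those two words are really unused for vmalloc pages still needs
checking against the code; the sketch only shows that the overlay is
possible size-wise.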

If the pages are in the buddy allocator, I _think_ mapping+index are
free; private is in use for the buddy order. But I haven't read through
the buddy code in a while.

Does it need to be a doubly linked list? Can it be an hlist?
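
For reference on the hlist question, a minimal userspace sketch of the
hlist shape follows. The struct and function names mirror the kernel's,
but this is a reimplementation for illustration, not code from the
patch, and drain() is a hypothetical analogue of __dispose_pages(). Note
that an hlist_node is still two pointers; what shrinks is the head (one
pointer), and the price is losing O(1) access to the tail.

```c
#include <assert.h>
#include <stddef.h>

/* Node: forward pointer plus a pointer to whatever points at us. */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

/* Head: a single pointer, half the size of a list_head. */
struct hlist_head {
	struct hlist_node *first;
};

/* Insert n at the front of the list. */
static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/*
 * Unlink n in O(1) without knowing the head. The kernel's hlist_del()
 * poisons the pointers; this sketch just clears them.
 */
static inline void hlist_del(struct hlist_node *n)
{
	struct hlist_node *next = n->next;

	*n->pprev = next;
	if (next)
		next->pprev = n->pprev;
	n->next = NULL;
	n->pprev = NULL;
}

/* Hypothetical drain loop: unlink every node and return the count. */
static inline int drain(struct hlist_head *h)
{
	int count = 0;

	while (h->first) {
		hlist_del(h->first);
		count++;
	}
	return count;
}
```

If the only operations are "push a page" and "walk and free everything",
as in the quoted __dispose_pages(), this shape covers both.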