Re: [PATCH 2/4] mm: mempool: introduce page bulk allocator
From: Yang Shi
Date: Tue Oct 18 2022 - 14:02:00 EST
On Mon, Oct 17, 2022 at 2:41 AM Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Oct 13, 2022 at 01:16:31PM -0700, Yang Shi wrote:
> > On Thu, Oct 13, 2022 at 5:38 AM Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, Oct 05, 2022 at 11:03:39AM -0700, Yang Shi wrote:
> > > > The page bulk allocator was introduced in v5.13 to allocate order-0
> > > > pages in bulk. There are a few mempool allocator callers which do
> > > > order-0 page allocation in a loop, for example, dm-crypt, f2fs compress,
> > > > etc. A mempool page bulk allocator seems useful. So introduce the
> > > > mempool page bulk allocator.
> > > >
> > > > It introduces the below APIs:
> > > > - mempool_init_pages_bulk()
> > > > - mempool_create_pages_bulk()
> > > > They initialize the mempool for page bulk allocator. The pool is filled
> > > > by alloc_page() in a loop.
> > > >
> > > > - mempool_alloc_pages_bulk_list()
> > > > - mempool_alloc_pages_bulk_array()
> > > > They do bulk allocation from mempool.
> > > > They do the following, conceptually:
> > > > 1. Call the bulk page allocator.
> > > > 2. If the allocation is fulfilled, return; otherwise try to
> > > >    allocate the remaining pages from the mempool.
> > > > 3. If it is fulfilled, return; otherwise retry from #1 with a
> > > >    sleepable gfp.
> > > > 4. If it still fails, sleep for a while to wait for the mempool to be
> > > >    refilled, then retry from #1.
> > > > The populated pages will stay on the list or array until the callers
> > > > consume them or free them.
> > > > Since the mempool allocator is guaranteed to succeed in sleepable context,
> > > > the two APIs return true for success or false for failure. It is the
> > > > caller's responsibility to handle the failure case (partial allocation),
> > > > just like with the page bulk allocator.
> > > >
> > > > The mempool typically is an object agnostic allocator, but bulk allocation
> > > > is only supported by pages, so the mempool bulk allocator is for page
> > > > allocation only as well.
> > > >
> > > > Signed-off-by: Yang Shi <shy828301@xxxxxxxxx>
> > >
> > > Overall, I think it's an ok approach and certainly a good use case for
> > > the bulk allocator.
> > >
> > > The main concern that I have is that the dm-crypt use case doesn't really
> > > want to use lists as such and it's just a means for collecting pages to pass
> > > to bio_add_page(). bio_add_page() is working with arrays but you cannot
> > > use that array directly as any change to how that array is populated will
> > > then explode. Unfortunately, what you have is adding pages to a list to
> > > take them off the list and put them in an array and that is inefficient.
> >
> > Yeah, I didn't think of a better way to pass the pages to dm-crypt.
> >
> > >
> > > How about this
> > >
> > > 1. Add a callback to __alloc_pages_bulk() that takes a page as a
> > > parameter like bulk_add_page() or whatever.
> > >
> > > 2. For page_list == NULL && page_array == NULL, the callback is used
> > >
> > > 3. Add alloc_pages_bulk_cb() that passes in the name of a callback
> > > function
> > >
> > > 4. In the dm-crypt case, use the callback to pass the page to bio_add_page
> > > for the new page allocated.
> >
> > Thank you so much for the suggestion. But I have a hard time
> > understanding how these work together. Do you mean call bio_add_page()
> > in the callback? But bio_add_page() needs other parameters. Or did I
> > misunderstand you?
> >
>
> I expected dm-crypt to define the callback. Using bio_add_page
> directly would not work as the bulk allocator has no idea what to pass
> bio_add_page. dm-crypt would likely need to create both a callback and an
> opaque data structure passed as (void *) to track "clone" and "len".

I see. Yeah, we have to pass "clone" and "len" to the callback via
pool_data. It should not be hard: dm-crypt already uses crypt_config
to maintain a counter of allocated pages, so we should just need to
pass the struct to the callback as a parameter.

But I'm wondering whether this is worth it or not. Will it make the
code harder to follow?
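
To make the callback idea concrete, here is a rough userspace sketch of
how the pieces could fit together. It is not kernel code and not the
patch itself. Every name in it is hypothetical and made up for
illustration: page_stub and bio_stub stand in for struct page and
struct bio, bio_stub_add_page() only loosely mimics bio_add_page(),
alloc_pages_bulk_cb_sketch() stands in for the proposed
alloc_pages_bulk_cb(), and crypt_bulk_ctx is the opaque structure that
carries "clone" and "len". calloc() stands in for the page allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_MAX_VECS  16u

/* Hypothetical stand-ins for struct page and struct bio. */
struct page_stub {
	unsigned char data[SKETCH_PAGE_SIZE];
};

struct bio_stub {
	struct page_stub *vecs[SKETCH_MAX_VECS];
	unsigned int nr_vecs;
	unsigned int size;	/* total bytes added so far */
};

/* Hypothetical callback type: return 0 to take the page, nonzero to stop. */
typedef int (*bulk_page_cb)(struct page_stub *page, void *data);

/* Opaque context the caller hands to the allocator as (void *). */
struct crypt_bulk_ctx {
	struct bio_stub *clone;
	unsigned int remaining;	/* the "len" still to be covered, in bytes */
};

/* Loosely mimics bio_add_page(): returns len on success, 0 if full. */
static int bio_stub_add_page(struct bio_stub *bio, struct page_stub *page,
			     unsigned int len)
{
	if (bio->nr_vecs >= SKETCH_MAX_VECS)
		return 0;
	bio->vecs[bio->nr_vecs++] = page;
	bio->size += len;
	return (int)len;
}

/* The callback the caller would define: it knows about clone and len. */
static int crypt_bulk_add_page(struct page_stub *page, void *data)
{
	struct crypt_bulk_ctx *ctx = data;
	unsigned int len;

	assert(ctx != NULL && ctx->clone != NULL);
	if (ctx->remaining == 0)
		return -1;
	len = ctx->remaining < SKETCH_PAGE_SIZE ? ctx->remaining
						: SKETCH_PAGE_SIZE;
	if (bio_stub_add_page(ctx->clone, page, len) != (int)len)
		return -1;
	ctx->remaining -= len;
	return 0;
}

/*
 * Stand-in for the bulk allocator: it knows nothing about bios and
 * hands each new page to the callback. Returns the number of pages
 * the callback accepted.
 */
static unsigned int alloc_pages_bulk_cb_sketch(unsigned int nr_pages,
					       bulk_page_cb cb, void *data)
{
	unsigned int done = 0;

	while (done < nr_pages) {
		struct page_stub *page = calloc(1, sizeof(*page));

		if (!page)
			break;
		if (cb(page, data) != 0) {
			free(page);
			break;
		}
		done++;
	}
	return done;
}

/* Release every page held by the bio stub and reset its counters. */
static void bio_stub_free_pages(struct bio_stub *bio)
{
	unsigned int i;

	for (i = 0; i < bio->nr_vecs; i++)
		free(bio->vecs[i]);
	bio->nr_vecs = 0;
	bio->size = 0;
}
```

In this sketch the caller fills a crypt_bulk_ctx with its clone and the
byte count, passes it together with crypt_bulk_add_page() to the
allocator, and then compares the returned page count with what it asked
for to detect a partial allocation. The extra indirection is visible
even in this toy version, which is the readability cost I am asking
about.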
>
> --
> Mel Gorman
> SUSE Labs