Re: [PATCH 0/3 v5] Introduce a bulk order-0 page allocator

From: Matthew Wilcox
Date: Tue Mar 23 2021 - 07:15:54 EST


On Mon, Mar 22, 2021 at 08:32:54PM +0000, Chuck Lever III wrote:
> > It's not expected that the array implementation would be worse *unless*
> > you are passing in arrays with holes in the middle. Otherwise, the success
> > rate should be similar.
>
> Essentially, sunrpc will always pass an array with a hole.
> Each RPC consumes the first N elements in the rq_pages array.
> Sometimes N == ARRAY_SIZE(rq_pages). AFAIK sunrpc will not
> pass in an array with more than one hole. Typically:
>
> .....PPPP
>
> My results show that, because svc_alloc_arg() ends up calling
> __alloc_pages_bulk() twice in this case, it ends up being
> twice as expensive as the list case, on average, for the same
> workload.

Can you call memmove() to shift all the pointers down to be the
first N elements? That prevents creating a situation where we have:

PPPPPPPP (consume 6)
......PP (try to allocate 6, only 4 available)
PPPP..PP

Instead, you'd do:

PPPPPPPP (consume 6)
PP...... (try to allocate 6, only 4 available)
PPPPPP..
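
A minimal userspace sketch of that compaction step, to make the diagram
concrete. It is not the sunrpc code: the helper name compact_to_head()
and its parameters are hypothetical, and it assumes the consumed slots
are exactly the first `consumed` entries of the array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): after the first `consumed`
 * entries of pages[0..n) have been used up, slide the surviving tail
 * down to index 0 and clear the vacated slots, so that the only hole
 * is a single run at the end of the array.
 *
 * Returns the number of live pointers now sitting at the front.
 */
static size_t compact_to_head(void **pages, size_t n, size_t consumed)
{
	size_t live, i;

	assert(consumed <= n);
	live = n - consumed;

	/* Source and destination may overlap, so memmove(), not memcpy(). */
	memmove(pages, pages + consumed, live * sizeof(*pages));

	/* Everything past the live entries is now a hole to be refilled. */
	for (i = live; i < n; i++)
		pages[i] = NULL;

	return live;
}
```

With 8 entries and 6 consumed, this turns ......PP into PP......, so a
partial refill extends the populated prefix instead of leaving a gap in
the middle.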

Alternatively, you could consume from the tail of the array instead of
the head. Some CPUs aren't as efficient at backwards walks as they
are for forwards walks, but let's keep the pressure on CPU manufacturers
to make better CPUs.
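
A rough userspace sketch of the tail-consuming variant, under the
assumption that the caller tracks how many entries at the front of the
array are populated. The name consume_from_tail() and the `live`/`out`
parameters are made up for illustration and are not part of sunrpc or
the bulk allocator.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel API): hand out up to `want` pages
 * from the end of the populated prefix pages[0..*live), walking
 * backwards. Consumed slots are cleared, so the hole always stays at
 * the tail and no memmove() is needed before refilling.
 *
 * Returns how many pages were actually written to `out`.
 */
static size_t consume_from_tail(void **pages, size_t *live,
				size_t want, void **out)
{
	size_t taken = 0;

	assert(pages && live && out);

	while (taken < want && *live > 0) {
		(*live)--;
		out[taken++] = pages[*live];
		pages[*live] = NULL;
	}

	return taken;
}
```

Starting from PPPPPPPP and consuming 6 leaves PP......, the same layout
the memmove() approach produces, at the cost of walking the array
backwards.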