Re: [PATCH] mm, slub: prefetch freelist in ___slab_alloc()
From: Hyeonggon Yoo
Date: Mon Aug 19 2024 - 05:34:01 EST
On Mon, Aug 19, 2024 at 4:02 PM Yongqiang Liu <liuyongqiang13@xxxxxxxxxx> wrote:
>
> commit 0ad9500e16fe ("slub: prefetch next freelist pointer in
> slab_alloc()") introduced prefetch_freepointer() for fastpath
> allocation. Using it at the first load of the freelist could give a
> small improvement in some workloads. Here are hackbench results on an
> arm64 machine (about 3.8% faster):
>
> Before:
> average time cost of 'hackbench -g 100 -l 1000': 17.068
>
> After:
> average time cost of 'hackbench -g 100 -l 1000': 16.416
>
> There is also about a 5% improvement on an x86_64 machine
> for hackbench.

I think adding more prefetches might not be a good idea unless we have
more real-world data supporting it. A prefetch might help when the slab
is frequently used, but it will end up unnecessarily using more cache
lines when the slab is not frequently used.

Also, I don't understand how adding a prefetch in the slowpath affects
performance, because most allocations and frees should be served by the
fastpath. Could you please explain?

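To make the trade-off concrete, here is a rough user-space sketch of the
pattern being discussed: pop the head of a freelist, then prefetch the
next free object's link. It is only an illustration under my own
assumptions, not the kernel code. struct pool, pool_init(), pool_alloc()
and pool_free() are made-up names, and __builtin_prefetch() (a GCC/Clang
builtin) stands in for the kernel's prefetch helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space pool; NOT the kernel's struct kmem_cache. */
struct pool {
	void *freelist;	/* first free object, or NULL when exhausted */
	size_t offset;	/* offset of the next-free link inside an object */
};

/* Read the next-free link stored inside a free object. */
static void *get_next(const struct pool *p, void *obj)
{
	return *(void **)((char *)obj + p->offset);
}

/* Store the next-free link inside a free object. */
static void set_next(const struct pool *p, void *obj, void *next)
{
	*(void **)((char *)obj + p->offset) = next;
}

/* Carve buf into count objects of size bytes and chain them in order. */
static void pool_init(struct pool *p, void *buf, size_t size, size_t count)
{
	char *base = buf;

	/* Each object must be able to hold an aligned link pointer. */
	assert(size >= sizeof(void *) && size % sizeof(void *) == 0);

	p->offset = 0;
	p->freelist = count ? base : NULL;
	for (size_t i = 0; i < count; i++)
		set_next(p, base + i * size,
			 i + 1 < count ? base + (i + 1) * size : NULL);
}

/* Pop the head object, then prefetch the link of the new head. */
static void *pool_alloc(struct pool *p)
{
	void *obj = p->freelist;

	if (!obj)
		return NULL;

	p->freelist = get_next(p, obj);
	/*
	 * The prefetch pulls the next object's cache line in early. That
	 * only pays off if another allocation follows soon; otherwise the
	 * line is fetched for nothing.
	 */
	if (p->freelist)
		__builtin_prefetch((char *)p->freelist + p->offset, 1);
	return obj;
}

/* Push an object back onto the head of the freelist. */
static void pool_free(struct pool *p, void *obj)
{
	set_next(p, obj, p->freelist);
	p->freelist = obj;
}
```

In this sketch the prefetch sits on every allocation, so it resembles the
fastpath case. The patch instead adds it to a path that should be taken
only occasionally, which is why I am asking where the gain comes from.
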
> Signed-off-by: Yongqiang Liu <liuyongqiang13@xxxxxxxxxx>
> ---
> mm/slub.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index c9d8a2497fd6..f9daaff10c6a 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3630,6 +3630,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>  	VM_BUG_ON(!c->slab->frozen);
>  	c->freelist = get_freepointer(s, freelist);
>  	c->tid = next_tid(c->tid);
> +	prefetch_freepointer(s, c->freelist);
>  	local_unlock_irqrestore(&s->cpu_slab->lock, flags);
>  	return freelist;
>
> --
> 2.25.1
>