Re: [PATCH 09/10] percpu: implement new dynamic percpu allocator

From: Rusty Russell
Date: Mon Feb 23 2009 - 21:56:43 EST


On Friday 20 February 2009 13:31:21 Tejun Heo wrote:
> > One question. Are you thinking that to be defined by every SMP arch
> > long-term?
>
> Yeap, definitely.

Excellent. That opens some really nice stuff.

> > Because there are benefits in having &<percpuvar> == valid
> > percpuptr, such as passing them around as parameters. If so, IA64
> > will want a dedicated per-cpu area for statics (tho it can probably
> > just map it somehow, but it has to be 64k).
>
> Hmmm... Don't have much idea about ia64 and its magic 64k. Can it
> somehow be used for the first chunk?

Yes, but I think that chunk must not be handed out for dynamic allocations
but kept in reserve for modules.

IA64 uses a pinned TLB entry to map this cpu's 64k at __phys_per_cpu_start.
See __ia64_per_cpu_var() in arch/ia64/include/asm/percpu.h. This means they
can also optimize cpu_local_* and read_cpuvar (or whatever it's called now).
IIUC IA64 needs this region internally; using it for percpu vars is a bonus.

> > These pseudo-constants seem like a really weird thing to do to me.
>
> I explained this in the reply to Andrew's comment. It's
> non-really-constant-but-should-be-considered-so-by-users thing. Is it
> too weird? Even if I add a comment explaining it?

It's weird; I'd make them __read_mostly and be done with it.

> > rbtree might be overkill on first cut. I'm bearing in mind that Christoph L
> > had a nice patch to use dynamic percpu allocation in the sl*b allocators;
> > which would mean this needs to only use get_free_page.
>
> Hmmm... the reverse mapping can be piggy backed on vmalloc by adding a
> private pointer to the vm_struct but rbtree isn't too difficult to use
> so I just did it directly. Nick, what do you think about adding
> private field to vm_struct and providing a reverse map function?

Naah, just walk the arrays to do the mapping. Cuts a heap of code, and
we can optimize when someone complains :)

Walking arrays is cache friendly, too.
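
To make the "just walk the arrays" suggestion concrete, here is a rough
userspace sketch of a reverse map done by linear scan. The struct and
function names (demo_chunk, demo_addr_to_chunk) are hypothetical and not
taken from the patch; it only assumes each chunk can be described by a base
address and a size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chunk descriptor; illustrative only, not the patch's struct. */
struct demo_chunk {
	uintptr_t base;	/* start address of the chunk's area */
	size_t size;	/* length of the area in bytes */
};

/*
 * Reverse-map an address to its chunk by scanning the array in order.
 * Returns NULL when no chunk contains the address.
 */
static struct demo_chunk *demo_addr_to_chunk(struct demo_chunk *chunks,
					     size_t nr_chunks, uintptr_t addr)
{
	for (size_t i = 0; i < nr_chunks; i++) {
		/* addr - base is only evaluated once addr >= base holds */
		if (addr >= chunks[i].base &&
		    addr - chunks[i].base < chunks[i].size)
			return &chunks[i];
	}
	return NULL;
}
```

The lookup is O(number of chunks), but it reads one contiguous array and
needs no insert/erase bookkeeping, which is the trade-off argued for above.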

> As for the sl*b allocation thing, can you please explain in more
> detail or point me to the patches / threads?

lkml from 2008-05-30:

Message-Id: <20080530040021.800522644@xxxxxxx>:
Subject: [patch 32/41] cpu alloc: Use in slub
And:
Subject: [patch 33/41] cpu alloc: Remove slub fields
Subject: [patch 34/41] cpu alloc: Page allocator conversion

> Thanks. :-)

Don't thank me: you're doing all the work!
Rusty.