Re: [rfc 08/45] cpu alloc: x86 support

From: Christoph Lameter
Date: Wed Nov 21 2007 - 14:02:08 EST


On Wed, 21 Nov 2007, Andi Kleen wrote:

> The whole mapping for all CPUs cannot fit into 2GB of course, but the reference
> linker managed range can.

Ok so you favor the solution where we subtract smp_processor_id() <<
shift?

> > The offset relative to %gs cannot be used if you have a loop and are
> > calculating the addresses for all instances. That is what we are talking
> > about. The CPU_xxx operations that are using the %gs register are fine and
> > are not affected by the changes we are discussing.
>
> Sure it can -- you just get the base address from a global array
> and then add the offset

Ok so generalize the data_offset for that case? I noted that other arches
and i386 have a similar solution there. I fiddled around some more and
found that the overhead that the subtraction introduces is equivalent to
loading an 8 byte constant of the base.

Keeping data_offset avoids the shift and the add for the
__get_cpu_var case, which needs CPU_PTR(..., smp_processor_id()): the
load from data_offset replaces the shifting and adding of
smp_processor_id().
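
A rough sketch of the table lookup, not the actual patch. NR_CPUS and
CPU_SHIFT are made-up values, the static array only stands in for the
real cpu area, and init_data_offset() and cpu_ptr_via_table() are
hypothetical names:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values, for illustration only. */
#define CPU_SHIFT 12			/* each cpu area is 1 << CPU_SHIFT bytes */
#define NR_CPUS   4

/* Stand-in for the real cpu area. */
static unsigned char cpu_area[NR_CPUS << CPU_SHIFT];

/* One precomputed base address per cpu. */
static uintptr_t data_offset[NR_CPUS];

/* Fill the table once, so later lookups need no shift. */
static void init_data_offset(void)
{
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		data_offset[cpu] = (uintptr_t)cpu_area +
				   ((uintptr_t)cpu << CPU_SHIFT);
}

/* A load from the table plus the variable offset; no shift, no base add. */
static void *cpu_ptr_via_table(size_t var_offset, unsigned int cpu)
{
	return (void *)(data_offset[cpu] + var_offset);
}
```

The cost moves from a shift and an add to a memory load from the table.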

For the loops this is not useful, since the compiler can move the
loading of the base pointer outside of the loop (if CPU_PTR needs to load
an 8 byte constant pointer).

With loading the 8 byte base the loops actually become:

sum = 0
ptr = CPU_AREA_BASE
while ptr < CPU_AREA_BASE + (NR_CPUS << shift) {
	sum += *ptr
	ptr += 1 << shift
}
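
The same loop as a compilable sketch in C. This is only an illustration:
NR_CPUS and CPU_SHIFT are made-up values, the static array stands in for
the real cpu area, and set_counter() and sum_all_cpus() are hypothetical
helpers:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical values, for illustration only. */
#define CPU_SHIFT 12			/* each cpu area is 1 << CPU_SHIFT bytes */
#define NR_CPUS   4

/* Stand-in for the real cpu area. */
static unsigned char cpu_area[NR_CPUS << CPU_SHIFT];

/* Store a long counter at var_offset in the area of the given cpu. */
static void set_counter(unsigned int cpu, size_t var_offset, long value)
{
	memcpy(cpu_area + ((size_t)cpu << CPU_SHIFT) + var_offset,
	       &value, sizeof(value));
}

/* Sum one counter over all cpus: base used once, constant stride. */
static long sum_all_cpus(size_t var_offset)
{
	size_t off = var_offset;
	unsigned int cpu;
	long sum = 0;
	long v;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		memcpy(&v, cpu_area + off, sizeof(v));
		sum += v;
		off += (size_t)1 << CPU_SHIFT;	/* step to the next cpu's instance */
	}
	return sum;
}
```

The loop body contains no per-cpu table lookup, only an add of the stride.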

So I think we need to go with the implementation where CPU_PTR(var, cpu)
is

CPU_AREA_BASE + (cpu << shift) + var_offset

The CPU_AREA_BASE will be loaded into a register. The var_offset usually
ends up being an offset in a mov instruction.
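
A minimal sketch of that calculation in C, under stated assumptions:
NR_CPUS and CPU_SHIFT are made-up values, the static array stands in for
a cpu area at a constant address, and cpu_ptr() is a hypothetical helper
rather than the macro from the patchset:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values, for illustration only. */
#define CPU_SHIFT 16			/* each cpu area is 1 << CPU_SHIFT bytes */
#define NR_CPUS   4

/* Stand-in for the cpu area; the real base would be a constant address. */
static unsigned char cpu_area[NR_CPUS << CPU_SHIFT];
#define CPU_AREA_BASE ((uintptr_t)cpu_area)

/* Address of the instance at var_offset that belongs to the given cpu. */
static inline void *cpu_ptr(size_t var_offset, unsigned int cpu)
{
	return (void *)(CPU_AREA_BASE + ((uintptr_t)cpu << CPU_SHIFT) +
			var_offset);
}
```

The parentheses around the shift matter in C because + binds tighter
than <<.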

> >
> > > Then the reference data would be initdata and eventually freed.
> > > That is similar to how the current per cpu data works.
> >
> > Yes that is also how the current patchset works. I just do not understand
> > what you want changed.
>
> Anyways i think your current scheme cannot work (too much VM, placed at the wrong
> place; some wrong assumptions).

The constant pointer solution fixes that. No need to despair.
