Re: [RFC/PATCH] Make powerpc64 use __thread for per-cpu variables
From: Paul Mackerras
Date: Wed May 10 2006 - 02:30:00 EST
Olof Johansson writes:
> Looks like alot of the text growth is from the added mfsprg3 instructions:
Yes, probably mostly from current.
> ... so, as the PACA gets deprecated, the bloat will go away again.
Yes. I was hoping to get rid of the paca entirely, but that would
mean it would have to be in the RMA (so that the early exception entry
code can use it) which means that it won't be node-local any more.
I have a patch which allocates the per-cpu areas in the RMA but now
I'm rethinking it, since Ben H (at least) thinks the per-cpu area
really needs to be node-local.
Moving current to a per-cpu variable means that we need to allocate at
least the boot cpu's per-cpu area earlier than we do now, since it
seems that printk references current. That makes it hard to make sure
the boot cpu's per-cpu area is node-local, unless we do something
tricky like reallocating it once the bootmem allocator is available.
> It would be interesting to see benchmarks of how much it improves
> things. I guess it doesn't really get interesting until after the paca
> gets removed though, due to the added mfsprg's.
I have moved current, smp_processor_id and a couple of other things to
per-cpu variables, and that results in the kernel text being about 8k
smaller than without any of these __thread patches. Performance seems
to be very slightly better but it's hard to be sure that the change is
statistically significant, from the measurements I've done so far.
Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/