Re: hackbench regression due to commit 9dfc6e68bfe6e

From: Tejun Heo
Date: Mon Apr 05 2010 - 21:28:11 EST


Hello,

On 04/06/2010 02:30 AM, Pekka Enberg wrote:
>> Hmmm... The dynamic percpu areas use page tables and that data is used
>> in the fast path. Maybe the high thread count causes TLB thrashing?
>
> Hmm indeed. I don't see anything particularly funny in the SLUB percpu
> conversion so maybe this is more an issue with the new percpu
> allocator?

By default, the percpu allocator embeds the first chunk in the kernel
linear mapping, so accesses there shouldn't involve any TLB overhead.
From the second chunk on, chunks are mapped page-by-page into the
vmalloc area. This could be updated to use larger page mappings, but a
2M page per cpu is pretty large and the trade-off hasn't been right yet.
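
To put rough numbers on that trade-off, here is a small standalone
sketch of the arithmetic. It is not kernel code: the 4 KiB base page
size and the helper names are assumptions made up for illustration,
and any per-cpu unit size you feed it is hypothetical.

```c
#include <stddef.h>

/* Assumed page sizes for illustration only (not taken from the kernel). */
#define SMALL_PAGE_SIZE (4UL << 10)   /* 4 KiB base page */
#define LARGE_PAGE_SIZE (2UL << 20)   /* 2 MiB large page, as mentioned above */

/* Number of pages of page_size needed to cover bytes, rounding up. */
static unsigned long pages_needed(unsigned long bytes, unsigned long page_size)
{
	return (bytes + page_size - 1) / page_size;
}

/* Bytes left unused once bytes is rounded up to whole pages. */
static unsigned long rounding_waste(unsigned long bytes, unsigned long page_size)
{
	return pages_needed(bytes, page_size) * page_size - bytes;
}
```

For a hypothetical 64 KiB per-cpu unit, page-by-page mapping needs 16
small mappings, while a single 2M mapping covers it with most of the
large page left unused, and that cost is paid once per cpu.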

The amount reserved for dynamic allocation in the first chunk is
determined by the PERCPU_DYNAMIC_RESERVE constant in
include/linux/percpu.h. It's currently 20k on 64-bit machines and 12k
on 32-bit. The intention was to size this such that the most common
stuff is allocated from this area. The 20k and 12k are numbers that I
pulled out of my ass :-) with the custom config I used. Now that more
stuff has been converted to dynamic percpu, it's quite possible that
the area is too small. Can you please try increasing the size of the
area (say, by 2 or 4 times) and see whether the performance regression
goes away?
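
For reference, the sizes that experiment would try can be worked out
with the standalone sketch below. The macro and function names here
are made up for illustration and only mirror the 20k and 12k values
quoted above; this is not a copy of include/linux/percpu.h.

```c
/* Values quoted in this mail; hypothetical names, not the kernel's. */
#define RESERVE_64BIT (20UL << 10)   /* 20k on 64-bit */
#define RESERVE_32BIT (12UL << 10)   /* 12k on 32-bit */

/* Reserve size in bytes after scaling the base size by factor. */
static unsigned long scaled_reserve(unsigned long base, unsigned int factor)
{
	return base * factor;
}
```

So on 64-bit the values to try would be 40k and 80k, and on 32-bit
24k and 48k.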

Thanks.

--
tejun