RE: [PATCH] ia64: change defconfig to NR_CPUS==1024

From: Chen, Kenneth W
Date: Thu Jan 05 2006 - 17:32:13 EST


hawkes@xxxxxxx wrote on Thursday, January 05, 2006 1:40 PM
> The downside is that the ia64 cpumask increases from 8 words to 16.
> I have tried various heavy workloads and have seen no significant
> measurable performance regression from this increase.

What type of heavy workloads have you measured? Including db transaction
processing and decision making workloads?

> The potential
> extra cachemiss seems to be lost in the noise. The for_each_*cpu()
> macros are relatively efficient in skipping past zeroed cpumask bits.
> Workloads that impose higher loads on the CPU Scheduler tend to
> bottleneck on non-Scheduler parts of the kernel, and it's the Scheduler
> which makes the principal use of the cpumask_t, so these extra
> cachemiss inefficiencies and extra CPU cycles to scan zero mask words
> just get lost in the general system overhead.

I found above claims are generally false for workload that puts tons
of pressure on CPU cache, especially with db workload. Typically
for db workload, the working set in user space is so large that making
a trip into the kernel has far large secondary effect then the primary
cache miss occurred in the kernel. In other word, cache lines evicted
by the kernel code have far larger impact to the overall application
performance and leads to lower overall lower system performance. So
when you say "get lost in the general system overhead", did you consider
the secondary effect it does to the application performance?

What we found is going from NR_CPU = 64 to 128, it has small performance
impact to db transaction processing workload. Though I have not measured
difference between 128 to 1024.

- Ken

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/