On Wed, Oct 07, 2015 at 04:00:42PM -0400, Waiman Long wrote:
On 10/06/2015 05:30 PM, Dave Chinner wrote:<sigh>
I understood what you were saying. However, the per-cpu counterWell, yes. Why do you think the above function does the same/*I don't think that will work as some other CPUs may change the
* Aggregate the per-cpu counter magazines back into the global
* counter. This avoids the need for repeated compare operations to
* run the slow path when the majority of the counter value is held
* in the per-cpu magazines. Folding them back into the global
* counter means we will continue to hit the fast
* percpu_counter_read() path until the counter value falls
* completely within the comparison limit passed to
* __percpu_counter_compare().
*/
static s64 percpu_counter_aggregate(struct percpu_counter *fbc)
{
s64 ret;
int cpu;
unsigned long flags;
raw_spin_lock_irqsave(&fbc->lock, flags);
ret = fbc->count;
for_each_online_cpu(cpu) {
s32 count = __this_cpu_read(*fbc->counters);
ret += count;
__this_cpu_sub(*fbc->counters, count)
}
fbc->count = ret;
raw_spin_unlock_irqrestore(&fbc->lock, flags);
return ret;
}
percpu counters values between percpu_counter_aggregate() and
__percpu_counter_compare(). To be safe, the precise counter has to
be compted whenever the comparison value difference is less than
nr_cpus * batch size.
function as percpu_counter_sum()? So that the percpu_counter_sum()
call *inside* __percpu_counter_compare() can be replaced by this
call. i.e.
return -1;
}
/* Need to use precise count */
- count = percpu_counter_sum(fbc);
+ count = percpu_counter_aggregate(fbc);
if (count> rhs)
return 1;
else if (count< rhs)
Please think about what I'm saying rather than dismissing it without
first understanding my suggestions.
isn't protected by the spinlock. Reading it is OK, but writing may
cause race if that counter is modified by a CPU other than its
owning CPU.
You're still trying to pick apart the code without considering what
we need to acheive. We don't need to the code to be bullet proof to
test whether this hypothesis is correct or not - we just need
something that is "near-enough" to give us the data point to tell us
where we should focus our efforts. If optimising the counter like
above does not reduce the overhead, then we may have to change XFS.
If it does reduce the overhead, then the XFS code remains unchanged
and we focus on optimising the counter code.