Re: [PATCH v2 2/2] ARM: Replace calls to __aeabi_{u}idiv with udiv/sdiv instructions

From: Måns Rullgård
Date: Wed Nov 25 2015 - 21:20:04 EST


Russell King - ARM Linux <linux@xxxxxxxxxxxxxxxx> writes:

> On Thu, Nov 26, 2015 at 12:50:08AM +0000, Måns Rullgård wrote:
>> If not calling the function saves an I-cache miss, the benefit can be
>> substantial. No, I have no proof of this being a problem, but it's
>> something that could happen.
>
> That's a simplistic view of modern CPUs.
>
> As I've already said, modern CPUs have branch prediction, but
> they also have speculative instruction fetching and speculative data
> prefetching - which the CPUs which have idiv support will have.
>
> With such features, the branch predictor is able to learn that the
> branch will be taken, and because of the speculative instruction
> fetching, it can bring the cache line in so that it has the
> instructions it needs with minimal or, if working correctly,
> no stalling of the CPU pipeline.

It doesn't matter how many fancy features the CPU has. Executing more
branches and using more cache lines puts additional pressure on those
resources, reducing overall performance. Besides, the performance
counters readily show that the prediction is nowhere near as perfect as
you seem to believe.
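
To make the two cases concrete, here is a minimal sketch, assuming a
GCC-style ARM toolchain. The function name scale() is made up for
illustration and does not come from the patch.

```c
/* Hypothetical helper, named for illustration only. */
unsigned int scale(unsigned int total, unsigned int parts)
{
	/*
	 * Built for a core without hardware divide (e.g. -march=armv7-a),
	 * this division becomes "bl __aeabi_uidiv": a branch out to the
	 * library helper, the helper's own instructions and cache lines,
	 * and a return.
	 *
	 * Built for a core with the divide extension (e.g.
	 * -mcpu=cortex-a15), it becomes a single inline "udiv".
	 */
	return total / parts;
}
```

Building this with "-O2 -S" under each of the two option sets and
comparing the assembly should show the out-of-line call in one case and
the inline instruction in the other.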

--
Måns Rullgård
mans@xxxxxxxxx