Re: [PATCH mmotm] fix broken bootup on 32-bit

From: Hugh Dickins
Date: Sun Mar 06 2011 - 13:01:44 EST

In reply to: Hugh Dickins: "Re: [PATCH mmotm] fix broken bootup on 32-bit"

On Sat, Mar 5, 2011 at 2:09 PM, Michał Nazarewicz <mina86@xxxxxxxxxx> wrote:
> On Mar 5, 2011 8:49 PM, "Hugh Dickins" <hughd@xxxxxxxxxx> wrote:
>> I realize that zeroes are handled, but I was imagining that one branch
>> taken (for numbers up to 9999) is cheaper than four out-of-line function
>> calls, six divisions-or-modulos by constant 10000, three multiplications
>> by constants; oh, and a lot more once I look inside put_dec_full4().
>>
>> Is that not the case? Isn't performance the justification for this magic?
>
> It turns out that difference in speed is minimal and inconclusive, as the
> version without cascading ifs seems to perform better on ARM. So because my
> benchmarks didn't show a clear winner, we can go with a shorter version.

At first I was surprised by that, but now I suspect that it reflects a
severe flaw in your benchmarking.

Am I right to think that you are measuring the performance of the
algorithms on random unsigned long longs? Which are very unlikely to
have all the upper 16 bits unset? Let alone the upper 32 bits or the
upper 48 bits all unset?
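
To make that concrete, here is a minimal sketch (it is not the benchmark
under discussion, which is not shown in this thread) that draws
pseudo-random 64-bit values and counts how many have their upper 16, 32
or 48 bits clear. The xorshift64 generator and the helper names are
illustrative assumptions. For uniformly random inputs the expected
fractions are 2^-16, 2^-32 and 2^-48 respectively, so a benchmark fed
such inputs would almost never exercise a small-number path.

```c
#include <stdint.h>

/* Illustrative PRNG (xorshift64); any uniform 64-bit source would do. */
static uint64_t xorshift64(uint64_t *state)
{
    uint64_t x = *state;

    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/* Returns 1 if the top `bits` bits of v are all zero (bits in 1..63). */
static int upper_bits_clear(uint64_t v, unsigned bits)
{
    return (v >> (64 - bits)) == 0;
}

/* Counts how many of n pseudo-random draws have their top `bits` bits clear. */
static uint64_t count_upper_clear(uint64_t seed, uint64_t n, unsigned bits)
{
    uint64_t state = seed ? seed : 1;   /* xorshift state must be nonzero */
    uint64_t hits = 0;

    for (uint64_t i = 0; i < n; i++)
        hits += (uint64_t)upper_bits_clear(xorshift64(&state), bits);
    return hits;
}
```

Calling count_upper_clear() with bits set to 16, 32 and 48 over a large
sample gives the three counts to compare against the sample size.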

Whereas, what would be the distribution of numbers that the kernel is
typically called upon to vsprintf? I put it to you that numbers with
the upper 48 bits all unset would predominate, followed by those with
just the upper 32 bits unset.
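
A benchmark could approximate such a distribution by masking its random
inputs. The sketch below is only an illustration: the 70/25/5 split is a
made-up placeholder, not a measurement of what the kernel actually
prints, and skewed_sample() is a hypothetical helper name.

```c
#include <stdint.h>

/* Hypothetical mix in percent; these weights are NOT measured values. */
enum {
    PCT_UPPER48_CLEAR = 70,   /* values that fit in 16 bits */
    PCT_UPPER32_CLEAR = 25    /* values that fit in 32 bits */
    /* the remaining 5% keep all 64 bits */
};

/*
 * Maps a raw random 64-bit value to a benchmark input.
 * `pick` is an independent random number used to choose the size class.
 */
static uint64_t skewed_sample(uint64_t raw, unsigned pick)
{
    pick %= 100;
    if (pick < PCT_UPPER48_CLEAR)
        return raw & 0xFFFFu;
    if (pick < PCT_UPPER48_CLEAR + PCT_UPPER32_CLEAR)
        return raw & 0xFFFFFFFFu;
    return raw;
}
```

Timing both algorithms on inputs generated this way, with weights taken
from real kernel output rather than guessed, would show whether the
extra branches pay for themselves.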

I'm sure there are u64s and s64s and unsigned long longs and long
longs to be found and printed, but the mm statistics I just looked up
appear to be merely unsigned longs, just 32 bits on 32-bit; and even
in the 64-bit case, I'd still expect that lower numbers would
generally predominate.

I suspect that, without more branching than you have at present, your
new algorithms actually slow down the kernel.
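
The kind of branching meant here can be sketched as follows. This is not
the code from the patch and does not reproduce put_dec_full4(); it is a
simplified, hypothetical format_decimal() that only shows the shape of
the idea: test the upper bits once, and when they are clear, do all the
work in 32-bit arithmetic, which avoids 64-bit division on 32-bit
machines.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Writes v in decimal to out (which must hold at least 21 bytes,
 * including the terminating NUL) and returns the number of digits.
 */
static size_t format_decimal(uint64_t v, char *out)
{
    char tmp[20];               /* UINT64_MAX has 20 decimal digits */
    size_t n = 0;
    size_t len;

    if ((v >> 32) == 0) {
        /* Common case: upper 32 bits clear, 32-bit arithmetic only. */
        uint32_t w = (uint32_t)v;

        do {
            tmp[n++] = (char)('0' + w % 10);
            w /= 10;
        } while (w);
    } else {
        /* Rare case: full 64-bit division and modulo. */
        do {
            tmp[n++] = (char)('0' + v % 10);
            v /= 10;
        } while (v);
    }

    /* Digits were produced least significant first; reverse them. */
    len = n;
    while (n)
        *out++ = tmp[--n];
    *out = '\0';
    return len;
}
```

A further test for values up to 9999 could be added in the same way;
whether each extra branch is worth it depends on the real distribution
of inputs.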

Hugh
