Re: Big git diff speedup by avoiding x86 "fast string" memcmp

From: Boaz Harrosh
Date: Thu Dec 16 2010 - 04:53:19 EST

On 12/15/2010 08:00 PM, David Miller wrote:
> From: Boaz Harrosh <bharrosh@xxxxxxxxxxx>
> Date: Wed, 15 Dec 2010 15:15:09 +0200
>
>> I agree that the byte-compare or long-compare should give you very close
>> results in modern pipelined CPUs. But surely 12 increment-and-tests should
>> show up against 3 (or even 2). I would say it must be a better plan.
>
> For strings of these lengths the setup code necessary to initialize
> the inner loop and the tail code to handle the sub-word ending cases
> eliminate whatever gains there are.
>

You misunderstood me. I'm saying that we know the beginning of the
string is aligned and Nick offered to pad the last long, so surely
a shift by 2 (or 3) + the reduction of the 12 dec-and-tests to 3
should give you an optimization?
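
To make the comparison concrete, here is a minimal sketch in C of the two
loops being discussed. It assumes both buffers start on a long boundary and
are zero-padded up to a multiple of sizeof(long); the function names are
hypothetical and are not taken from the kernel source.

```c
#include <stddef.h>

/* Byte-at-a-time compare: one decrement-and-test per byte,
 * so a 12-byte name costs up to 12 loop iterations. */
static int names_differ_bytes(const unsigned char *a,
                              const unsigned char *b, size_t len)
{
    while (len--) {
        if (*a++ != *b++)
            return 1;
    }
    return 0;
}

/* Word-at-a-time compare (hypothetical helper).
 * ASSUMPTION: a and b are long-aligned and zero-padded to a multiple
 * of sizeof(long), so no sub-word tail handling is needed. */
static int names_differ_words(const unsigned long *a,
                              const unsigned long *b, size_t len)
{
    /* Round up to whole words; the division by sizeof(long) is the
     * "shift by 2 (or 3)" for 4-byte (or 8-byte) longs. */
    size_t words = (len + sizeof(long) - 1) / sizeof(long);

    while (words--) {
        if (*a++ != *b++)
            return 1;
    }
    return 0;
}
```

For a 12-byte name the word loop runs 3 times with a 4-byte long, or 2 times
with an 8-byte long, instead of 12. Whether that wins in practice is exactly
the point under debate in this thread.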

> I know this as I've been hacking on assembler optimized strcmp() and
> memcmp() in my spare time over the past year or so.

Boaz
