fast_copy_page sections?!

From: Karl Vogel (karl.vogel@seagha.com)
Date: Fri Jun 20 2003 - 03:38:58 EST


In arch/i386/lib/mmx.c there is a function fast_copy_page() that uses some
crafty asm statements. However, I don't quite understand the part with the
fixup sections. Can anybody shed some light on what is going on here?

----- arch/i386/lib/mmx.c
static void fast_copy_page(void *to, void *from)
{
        int i;

        kernel_fpu_begin();

        /* maybe the prefetch stuff can go before the expensive fnsave...
         * but that is for later. -AV
         */
        __asm__ __volatile__ (
                "1: prefetch (%0)\n"
                " prefetch 64(%0)\n"
                " prefetch 128(%0)\n"
                " prefetch 192(%0)\n"
                " prefetch 256(%0)\n"
                "2: \n"
                ".section .fixup, \"ax\"\n"
                "3: movw $0x1AEB, 1b\n" /* jmp on 26 bytes */
                " jmp 2b\n"
                ".previous\n"
                ".section __ex_table,\"a\"\n"
                " .align 4\n"
                " .long 1b, 3b\n"
                ".previous"
                : : "r" (from) );

        for(i=0; i<(4096-320)/64; i++)
        {
                __asm__ __volatile__ (
                "1: prefetch 320(%0)\n"
                "2: movq (%0), %%mm0\n"
                " movntq %%mm0, (%1)\n"
                " movq 8(%0), %%mm1\n"
                " movntq %%mm1, 8(%1)\n"
                " movq 16(%0), %%mm2\n"
                " movntq %%mm2, 16(%1)\n"
                " movq 24(%0), %%mm3\n"
                " movntq %%mm3, 24(%1)\n"
                " movq 32(%0), %%mm4\n"
                " movntq %%mm4, 32(%1)\n"
                " movq 40(%0), %%mm5\n"
                " movntq %%mm5, 40(%1)\n"
                " movq 48(%0), %%mm6\n"
                " movntq %%mm6, 48(%1)\n"
                " movq 56(%0), %%mm7\n"
                " movntq %%mm7, 56(%1)\n"
                ".section .fixup, \"ax\"\n"
                "3: movw $0x05EB, 1b\n" /* jmp on 5 bytes */
                " jmp 2b\n"
                ".previous\n"
                ".section __ex_table,\"a\"\n"
                " .align 4\n"
                " .long 1b, 3b\n"
                ".previous"
                : : "r" (from), "r" (to) : "memory");
                from+=64;
                to+=64;
        }
--- cut the remainder ---
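
For reference, here is how I understand the general .fixup/__ex_table
pattern, as a stripped-down get_user()-style sketch (the function name
read_user_int and the hard-coded -14 for -EFAULT are made up for
illustration, and this only means anything inside the kernel, where the
page fault handler searches __ex_table for the faulting address and
resumes at the matching fixup):

----- illustrative sketch, not real kernel code
static inline int read_user_int(int *result, const int *addr)
{
        int err = 0;
        int val = 0;

        __asm__ __volatile__ (
                "1: movl (%2), %0\n"            /* may fault if addr is bad    */
                "2:\n"
                ".section .fixup, \"ax\"\n"     /* out-of-line recovery code   */
                "3: movl $-14, %1\n"            /* err = -EFAULT               */
                "   xorl %0, %0\n"              /* val = 0                     */
                "   jmp 2b\n"                   /* resume right after the load */
                ".previous\n"
                ".section __ex_table, \"a\"\n"  /* (faulting insn, fixup) pair */
                "   .align 4\n"
                "   .long 1b, 3b\n"
                ".previous"
                : "=r" (val), "+r" (err)
                : "r" (addr));

        *result = val;
        return err;
}
-----

If that reading is right, then the fixup at label 3 in fast_copy_page()
seems to patch the prefetch at label 1 into a short jmp (0xEB plus an
8-bit displacement), so the prefetches are simply skipped from then on,
presumably for CPUs where prefetch can fault. But I'd like confirmation
that this is really what is going on.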

Also, a while ago there were some posts about a faster version. Does
anybody know why that version was never used? (Did nobody send a patch, or
was there something wrong with it?)

Message archive at:
  http://www.ussg.iu.edu/hypermail/linux/kernel/0210.3/0156.html
