Re: [patch] speed up / fix the new generic semaphore code (fix AIM7 40% regression with 2.6.26-rc1)
From: Linus Torvalds
Date: Thu May 08 2008 - 19:14:53 EST
On Thu, 8 May 2008, Linus Torvalds wrote:
>
> Btw, sparse will complain about those, because the source code *looks*
> really cheap.
Sometimes you can fix it.
For example, this change:
- if (pte_present(*pte) && page_to_pfn(page) == pte_pfn(*pte)) {
+ if (pte_present(*pte) && page == pfn_to_page(pte_pfn(*pte))) {
can simplify things: instead of going from a 'struct page' to a pfn, it
goes from a pfn to a 'struct page', and that is generally cheaper
(a multiply rather than a divide by the size of struct page). The two
forms aren't always interchangeable, but I think in this case they are.
For me, the code generation changes:
- movabsq $7905747460161236407, %rdx #, tmp111
- movabsq $32985348833280, %rax #, tmp107
- leaq (%r12,%rax), %rax #, tmp106
- sarq $3, %rax #, tmp106
- imulq %rdx, %rax # tmp111, tmp106
- movabsq $70368744177663, %rdx #, tmp113
- andq %rdx, %rcx # tmp113, pte$pte
- shrq $12, %rcx #, pte$pte
- cmpq %rcx, %rax # pte$pte, tmp106
+ movabsq $70368744177663, %rax #, tmp107
+ andq %rax, %rdx # tmp107, pte$pte
+ shrq $12, %rdx #, pte$pte
+ imulq $56, %rdx, %rax #, pte$pte, tmp109
+ movabsq $-32985348833280, %rdx #, tmp111
+ addq %rdx, %rax # tmp111, tmp110
+ cmpq %rax, %r13 # tmp110, page
which isn't a *huge* deal, but it certainly looks better. One less big
constant, and one less shift.
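To make the two directions concrete, here is a small user-space C model of the conversions. It is only a sketch, not kernel code: the model_* names are made up for illustration, and every constant is read straight off the assembly above (a 56-byte struct page from "imulq $56", a struct page array based at 0xffffe20000000000 from "movabsq $-32985348833280", and a 46-bit mask plus 12-bit shift for the pte). The first big constant in the old code is the multiplicative inverse of 7 modulo 2^64, so the "sarq $3" plus "imulq" pair is an exact divide by 56.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, taken from the quoted assembly rather than kernel headers. */
#define MODEL_PAGE_STRUCT_SIZE 56ULL                   /* imulq $56 */
#define MODEL_VMEMMAP_BASE     0xffffe20000000000ULL   /* -32985348833280 */
#define MODEL_INV7             7905747460161236407ULL  /* 7 * INV7 == 1 mod 2^64 */
#define MODEL_PTE_PFN_MASK     70368744177663ULL       /* 2^46 - 1 */
#define MODEL_PAGE_SHIFT       12

/* pte -> pfn: mask then shift (the andq + shrq present in both versions). */
static inline uint64_t model_pte_pfn(uint64_t pte)
{
    return (pte & MODEL_PTE_PFN_MASK) >> MODEL_PAGE_SHIFT;
}

/* pfn -> struct page address: one multiply by a small constant, one add. */
static inline uint64_t model_pfn_to_page(uint64_t pfn)
{
    return MODEL_VMEMMAP_BASE + pfn * MODEL_PAGE_STRUCT_SIZE;
}

/*
 * struct page address -> pfn: subtract the base, then divide by 56.
 * Because the offset is an exact multiple of 56 = 8 * 7, the divide can be
 * done as a shift by 3 followed by a multiply with the inverse of 7.
 */
static inline uint64_t model_page_to_pfn(uint64_t page)
{
    uint64_t off = page - MODEL_VMEMMAP_BASE;

    assert(off % MODEL_PAGE_STRUCT_SIZE == 0); /* trick needs exact division */
    return (off >> 3) * MODEL_INV7;
}

/* Old comparison: convert the page to a pfn. */
static inline int model_match_old(uint64_t page, uint64_t pte)
{
    return model_page_to_pfn(page) == model_pte_pfn(pte);
}

/* New comparison: convert the pfn to a page. */
static inline int model_match_new(uint64_t page, uint64_t pte)
{
    return page == model_pfn_to_page(model_pte_pfn(pte));
}
```

For a valid struct page address both forms agree: with a pte value of 0x12345067 (pfn 0x12345) and page set to model_pfn_to_page(0x12345), model_match_old() and model_match_new() both return 1. The new form just gets there without the subtract-shift-multiply sequence and its extra 64-bit constant.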
It's not going to make a huge difference, though. That function is just
called too much, and it would still be entirely data-dependent all the way
through.
Linus