Re: [PATCH] x86: Optimize tail handling for copy_user

From: Linus Torvalds
Date: Wed Jul 30 2008 - 13:33:12 EST




On Wed, 30 Jul 2008, Vitaly Mayatskikh wrote:
>
> Another try.

Ok, this is starting to look more reasonable. But you cannot split things
up like this per-file, because the end result doesn't _work_ with the
changes separated.

> BYTES_LEFT_IN_PAGE macro returns PAGE_SIZE, not zero, when the address
> is well aligned to page.

Hmm. Why? If the address is aligned, then we shouldn't even tro to copy
any more, should we? We know we got a fault - and regardless of whether it
was because of some offset off the base pointer or not, if the base
pointer was at offset zero, it's going to be in the same page. So why try
to do an operation we know will fault again?

Also, that's a rather inefficient way to do it, isn't it? Maybe the
compiler can figure it out, but the efficient code would be just

PAGE_SIZE - ((PAGE_SIZE-1) &(unsigned long)ptr)

no? That said, exactly because I think we shouldn't even bother to try to
fix up faults that happened at the beginning of a page, I think the right
one is the one I think I posted originally, ie the one that does just

#define BYTES_LEFT_IN_PAGE(ptr) \
(unsigned int)((PAGE_SIZE-1) & -(long)(ptr))

which is a bit simpler (well, it requires some thought to know why it
works, but it generates good code).

In case you wonder why it works, the operation we _want_ do do is

(PAGE_SIZE - offset-in-page) mod PAGE_SIZE

but subtraction is "stable" in modulus calculus (*), so you can write that
as

(PAGE_SIZE mod PAGE_SIZE - offset-in-page) mod PAGE_SIZE

which is just

(0 - (ptr mod PAGE_SIZE)) mod PAGE_SIZE

but again, subtraction is stable in modulus, so you can write that as

(0 - ptr) mod PAGE_SIZE

and so the result is literally just those single 'neg' and 'and'
instructions (in the macro, you then need all the casting and the
parenthesis, which is why it gets ugly again)

And yes, maybe the compiler figures it all out, but judging by past
experience, things often don't work that well.

Linus

(*) Yeah, in math, it's stable in general, in 2's complement arithmetic
it's only stable in mod 2^n, I guess.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/