Re: objtool clac/stac handling change..

From: Al Viro
Date: Fri Jul 03 2020 - 22:12:28 EST


On Sat, Jul 04, 2020 at 01:49:59AM +0100, Al Viro wrote:
> On Fri, Jul 03, 2020 at 10:02:37PM +0100, Al Viro wrote:
>
> > PS: I'm still going through the _ASM_EXTABLE... users on x86, so there
> > might be more fun. Will post when I'm done...
>
> Lovely... Not directly related to that, but... WTF?
>
> arch/x86/lib/csum-copy_64.S:
>
> /*
> * No _ASM_EXTABLE_UA; this is used for intentional prefetch on a
> * potentially unmapped kernel address.
> */
> .macro ignore L=.Lignore
> 30:
> _ASM_EXTABLE(30b, \L)
> .endm
>
> ...
> ignore 2f
> prefetcht0 5*64(%rdi)
> 2:
>
> (and no other users of 'ignore' anywhere). How could prefetcht0 possibly
> raise an exception? Intel manual says that the only exception is #UD if
> LOCK PREFETCHT0 is encountered; not here, obviously. AMD manual simply
> says "no exceptions". Confused...
>
> Incidentally, in the same file:
> SYM_FUNC_START(csum_partial_copy_generic)
> cmpl $3*64, %edx
> jle .Lignore
>
> .Lignore:
> ....
>
> And it had been that way since "[PATCH] Intel x86-64 support merge" back
> in 2004, where we had
> @@ -59,15 +59,6 @@ csum_partial_copy_generic:
> cmpl $3*64,%edx
> jle .Lignore
>
> - ignore
> - prefetch (%rdi)
> - ignore
> - prefetch 1*64(%rdi)
> - ignore
> - prefetchw (%rsi)
> - ignore
> - prefetchw 1*64(%rsi)
> -
> .Lignore:
> ....
> @@ -115,7 +106,7 @@ csum_partial_copy_generic:
> movq 56(%rdi),%r13
>
> ignore 2f
> - prefetch 5*64(%rdi)
> + prefetcht0 5*64(%rdi)
> 2:
> adcq %rbx,%rax
> adcq %r8,%rax
>
> What's going on in there? According to AMD manual, prefetch and prefetchw
> can raise an exception (#UD), if
> PREFETCH/PREFETCHW are not supported, as
> indicated by ECX bit 8 of CPUID function
> 8000_0001h
> Long Mode is not supported, as indicated by EDX
> bit 29 of CPUID function 8000_0001h
> The 3DNow! instructions are not supported, as
> indicated by EDX bit 31 of CPUID function
> 8000_0001h.
> so these at least used to make some sense, but why leave that thing at
> the place where old prefetch became prefetcht0 and what is that comment
> in front of 'ignore' definition about? Exceptions there had never
> been about unmapped addresses - that would make no sense for prefetch.
>
> What am I missing here?

BTW, looking at csum_and_copy_{to,from}_user() callers (all 3 of them,
all in lib/iov_iter.c) we have this:
1) len is never 0
2) sum (initial value of csum) is always 0
3) failure (reported via *err_ptr) is always treated as "discard
the entire iovec segment (and possibly the entire iovec)". Exact value
put into *err_ptr doesn't matter (it's only compared to 0) and in case of
error the return value is ignored.

Now, using ~0U instead of 0 for initial sum would yield an equivalent csum
(congruent modulo 2^16-1) *AND* never yield 0 (recall how csum addition works).
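
To spell out the arithmetic, here is a minimal userspace sketch; it is not the
kernel code. The helper names (csum_add32, csum_fold16, checksum_words) are
made up for illustration, and it sums whole 32-bit words rather than doing
what the real asm does. It only shows that end-around-carry addition starting
from ~0U cannot reach 0, and that the folded result represents the same
residue modulo 2^16-1 as a sum started from 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 32-bit ones' complement add with end-around carry. */
static uint32_t csum_add32(uint32_t sum, uint32_t addend)
{
	uint64_t t = (uint64_t)sum + addend;

	/*
	 * The carry out of bit 31 is added back in.  The result can be 0
	 * only when both inputs are 0, so a non-zero sum stays non-zero.
	 */
	return (uint32_t)(t + (t >> 32));
}

/* Hypothetical helper: fold 32 bits down to 16 (no final inversion). */
static uint16_t csum_fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Accumulate n 32-bit words on top of the given initial value. */
static uint32_t checksum_words(const uint32_t *p, size_t n, uint32_t init)
{
	uint32_t sum = init;
	size_t i;

	assert(p != NULL);
	for (i = 0; i < n; i++)
		sum = csum_add32(sum, p[i]);
	return sum;
}
```

With all-zero data the sum started from 0 is 0, while the one started from ~0U
is 0xffffffff, which folds to 0xffff; both are zero modulo 2^16-1.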

IOW, we could simply return 0 to indicate an error. Which gives much saner
calling conventions:
__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
copying the damn thing and returning 0 on error or a non-zero value congruent
to csum of the data modulo 2^16-1 on success. Same for csum_and_copy_to_user()
(modulo const and __user being on the other argument).
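
A rough userspace mock of what that convention looks like from the caller's
side, purely as a sketch. Everything here is hypothetical: mock_accessible_len
stands in for a faulting user access, mock_csum_and_copy_from_user() is not
the real function (no __user, no __wsum, plain byte-pair summing), and
copy_segment() is an invented caller, not anything in lib/iov_iter.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fault model: a copy longer than this "faults". */
static int mock_accessible_len = 1 << 20;

/* Mock of the proposed convention: 0 on error, non-zero csum on success. */
static uint32_t mock_csum_and_copy_from_user(const void *src, void *dst, int len)
{
	const unsigned char *s = src;
	uint64_t sum = 0xffffffffu;	/* start from ~0U, so never 0 */
	int i;

	assert(len > 0);		/* callers never pass len == 0 */
	if (len > mock_accessible_len)
		return 0;		/* the only way to get 0 back */

	memcpy(dst, src, (size_t)len);
	for (i = 0; i + 1 < len; i += 2) {
		sum += (uint32_t)s[i] | ((uint32_t)s[i + 1] << 8);
		sum = (sum & 0xffffffffu) + (sum >> 32);  /* end-around carry */
	}
	if (i < len) {			/* odd trailing byte */
		sum += s[i];
		sum = (sum & 0xffffffffu) + (sum >> 32);
	}
	return (uint32_t)sum;
}

/* Hypothetical caller: no err_ptr, just test the return value. */
static int copy_segment(const void *src, void *dst, int len, uint32_t *csum)
{
	uint32_t next = mock_csum_and_copy_from_user(src, dst, len);

	if (!next)
		return -1;		/* discard the segment */
	*csum = next;
	return 0;
}
```

The point is only that the caller no longer needs a separate error slot; a
single comparison of the return value with 0 covers it.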

For x86 it simplifies the instances (both the inline wrappers and asm parts);
I hadn't checked the other architectures yet, but it looks like that should
be doable for all architectures. And it does simplify the callers...