Re: [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics
From: Jakub Jelinek
Date: Mon Feb 13 2023 - 06:03:13 EST
On Fri, Feb 10, 2023 at 09:07:14PM +0100, Marco Elver wrote:
> On Fri, 10 Feb 2023 at 20:25, Jakub Jelinek <jakub@xxxxxxxxxx> wrote:
> >
> > On Wed, Feb 08, 2023 at 07:42:03PM +0100, Marco Elver wrote:
> > > Clang 15 will provide an option to prefix calls to memcpy/memset/memmove
> > > with __asan_ in instrumented functions: https://reviews.llvm.org/D122724
> > >
> > > GCC does not yet have similar support.
> >
> > GCC has support to rename memcpy/memset etc. for years, say on
> > following compiled with
> > -fsanitize=kernel-address -O2 -mstringop-strategy=libcall
> > (the last option just to make sure the compiler doesn't prefer to emit
> > rep mov*/stos* or loop or something similar, of course kernel can keep
> > whatever it uses) you'll get just __asan_memcpy/__asan_memset calls,
> > no memcpy/memset, while without -fsanitize=kernel-address you get
> > normally memcpy/memset.
>
> > Or do you need the __asan_* functions only in asan instrumented functions
> > and normal ones in non-instrumented functions in the same TU?
>
> Yes, exactly that: __asan_ in instrumented, and normal ones in
> no_sanitize functions; they can be mixed in the same TU. We can't
> rename normal mem*() functions everywhere. In no_sanitize functions
> (in particular noinstr), normal mem*() should be used. But in
> instrumented code, it should be __asan_mem*(). Another longer
> explanation I also just replied here:
> https://lore.kernel.org/all/CANpmjNNH-O+38U6zRWJUCU-eJTfMhUosy==GWEOn1vcu=J2dcw@xxxxxxxxxxxxxx/
>
> At least clang has had this behaviour for user space ASan forever:
> https://godbolt.org/z/h5sWExzef - so it was easy to just add the flag
> to make it behave like in user space for mem*() in the kernel. It
> might also be worthwhile for GCC to emit __asan_ for user space, given
> that the runtimes are shared and the user space runtime definitely has
> __asan_. The kernel needs the param (asan-kernel-mem-intrinsic-prefix)
> though, to not break older kernels.
So, what exactly you want for gcc to do with
--param asan-kernel-mem-intrinsic-prefix=1 (note, in GCC case it can't be
without the =1 at the end)?
The current gcc behavior is that operations like aggregate copies, or
clearing which might or might not need memcpy/memset/memmove under the hood
later are asan instrumented before the operation (in order not to limit the
choices on how it will be expanded), uses of builtins (__builtin_ prefixed
or not) are also instrumented before the calls unless they are one of the
calls that is recognized as always instrumented. None for hwasan,
for asan:
index, memchr, memcmp, memcpy, memmove, memset, strcasecmp, strcat, strchr,
strcmp, strcpy, strdup, strlen, strncasecmp, strncat, strncmp, strcspn,
strpbrk, strspn, strstr, strncpy
and for those builtins gcc disables inline expansion and enforces a library
call (but until the expansion they are treated in optimizations like normal
builtins and so could be say DCEd, or their aliasing behavior is considered
etc.). kasan behaves the same I think.
Now, I think libasan only has __asan_ prefixed
__asan_memmove, __asan_memset and __asan_memcpy, nothing else, so most of
the calls from the above list even can't be prefixed.
So, do you want for --param asan-kernel-mem-intrinsic-prefix=1 to __asan_
prefix just memcpy/memmove/memset and nothing else? Is it ok to emit
memcpy/memset/memmove from aggregate operations which are instrumented
already at the caller (and similarly is it ok to handle those operations
inline)?
Jakub