Re: [PATCH] kasan: turn off asan-stack for clang-8 and earlier
From: Arnd Bergmann
Date: Thu Feb 21 2019 - 12:19:30 EST
On Thu, Feb 21, 2019 at 12:47 AM Kostya Serebryany <kcc@xxxxxxxxxx> wrote:
>
> On Wed, Feb 20, 2019 at 2:12 PM Kostya Serebryany <kcc@xxxxxxxxxx> wrote:
> >
> > On Wed, Feb 20, 2019 at 1:40 PM Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > >
> > > On Wed, Feb 20, 2019 at 10:13 PM Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > > >
> > > > In the example in https://bugs.llvm.org/show_bug.cgi?id=38809#c12
> > > > (https://godbolt.org/z/ylsGSQ) there is no inlining, yet clang uses
> > > > over ten times as much stack space as gcc, for reasons I still
> > > > can't explain. My assumption right now is that the underlying bug
> > > > causes most of the problems with excessive stack usage in
> > > > allmodconfig kernels.
> > >
> > > Here is an even more minimal example:
> > >
> > > struct s { int i[5]; } f(void);
> > > void g(void) { f(); f();}
> >
> > On this example I can see some stupidity that clang/asan is doing.
> > Let me try fixing it and see if it helps bigger cases.
> > Thanks for reducing the case!
> >
> > This is the input we get in the asan instrumentation:
> >
> > ; Function Attrs: noinline nounwind optnone sanitize_address uwtable
> > define dso_local void @g() #0 {
> > entry:
> > %tmp = alloca %struct.s, align 4 <<<<<<<<<<<<<<<<<<<<<<<
> > %tmp1 = alloca %struct.s, align 4
> > %0 = bitcast %struct.s* %tmp to i8*
> > call void @llvm.lifetime.start.p0i8(i64 20, i8* %0) #3
> > call void @f(%struct.s* sret %tmp)
> > %1 = bitcast %struct.s* %tmp to i8* <<<<<<<<<<<<<<<<<<<<<
> > call void @llvm.lifetime.end.p0i8(i64 20, i8* %1) #3
> > %2 = bitcast %struct.s* %tmp1 to i8*
> > call void @llvm.lifetime.start.p0i8(i64 20, i8* %2) #3 <<<<<<<<<<<<<
> > call void @f(%struct.s* sret %tmp1)
> > %3 = bitcast %struct.s* %tmp1 to i8*
> > call void @llvm.lifetime.end.p0i8(i64 20, i8* %3) #3
> > ret void
> > }
>
> Err.. taking my words back.
> These allocas *are* used other then in lifetime markers, since they
> are passed to f() as 'sret'.
Thanks a lot for the analysis!
> And we can not drop instrumentation for such allocas. Example:
>
> static volatile int zero = 0;
> typedef struct {
> int ar[5];
> } S;
> S foo() {
> S s;
> s.ar[zero + 6] = 42;
> return s;
> }
> int main(int argc, char **argv) {
> S s = foo();
> return s.ar[argc];
> }
>
> % clang -g -O1 -fsanitize=address sret.c && ./a.out
>
> ==5822==ERROR: AddressSanitizer: stack-buffer-overflow on address
> 0x7ffcad027f78 at pc 0x0000004f878d bp 0x7ffcad027f20 sp
> 0x7ffcad027f18
> WRITE of size 4 at 0x7ffcad027f78 thread T0
> #0 0x4f878c in foo sret.c:7:18
> #1 0x4f8838 in main sret.c:11:9
> Address 0x7ffcad027f78 is located in stack of thread T0 at offset 56 in frame
> #0 0x4f879f in main sret.c:10
>
> This frame has 1 object(s):
> [32, 52) 's' (line 11) <== Memory access at offset 56 overflows
> this variable
>
> Here we have a struct return that needs to be instrumented inside main
> so that a buffer overflow in foo() is detected.
I tried reproducing this with gcc. The example you list above
will generate very similar code on gcc. However, if I change main() to
not assign the return value of foo() to a local variable, I get a trivial
main main function without sanitizer code, but foo() still detects the
overflow:
=================================================================
==3442660==ERROR: AddressSanitizer: stack-buffer-overflow on address
0x7fff5682bd58 at pc 0x0000004009c3 bp 0x7fff5682bd10 sp
0x7fff5682bd00
WRITE of size 4 at 0x7fff5682bd58 thread T0
#0 0x4009c2 in foo (/tmp/a.out+0x4009c2)
#1 0x4009dd in main (/tmp/a.out+0x4009dd)
#2 0x7fc6f0ce482f in __libc_start_main
(/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
#3 0x400808 in _start (/tmp/a.out+0x400808)
Address 0x7fff5682bd58 is located in stack of thread T0 at offset 56 in frame
#0 0x4008c6 in foo (/tmp/a.out+0x4008c6)
This frame has 1 object(s):
[32, 52) 's' <== Memory access at offset 56 overflows this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism or swapcontext
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (/tmp/a.out+0x4009c2) in foo
The other difference is that with gcc, we only get one copy of each stack
variable even when it adds code from asan-stack to check it, while clang
generally creates one copy per instance when asan-stack=1 is set
(but doesn't otherwise).
> Now, I am also not confident that the reduced case reflects the real problem.
No, it probably doesn't, but it seems vaguely related. This is a problem
of using creduce, it sometimes reduces the test cases too much and
runs into a different symptom.
The case we frequently see in the kernel is probably more like
void extf(long *);
static inline __attribute__((always_inline)) void f(void)
{
long i = 7;
extg(&i);
}
int main(void)
{
f();
f();
}
https://godbolt.org/z/ReEQLo
Here, both gcc and clang add sanitizing for 'i', but clang has multiple
copies, while gcc only has one. This happens for both scalars and
structs, and inline functions as well as open-coded blocks or macros
with the same code in them. Using the inline function for illustration
above since that is the most common way it appears in the kernel.
Arnd