Re: [PATCH v2 03/11] tty: kbd: reduce stack size with KASAN

From: Arnd Bergmann
Date: Fri Jun 16 2017 - 16:57:10 EST

On Fri, Jun 16, 2017 at 7:29 PM, Dmitry Torokhov
<dmitry.torokhov@xxxxxxxxx> wrote:
> On Fri, Jun 16, 2017 at 8:58 AM, Samuel Thibault
> <samuel.thibault@xxxxxxxxxxxx> wrote:
>> Arnd Bergmann, on ven. 16 juin 2017 17:41:47 +0200, wrote:
>>> The problem are the 'ch' and 'flag' variables that are passed into
>>> tty_insert_flip_char by value, and from there into
>>> tty_insert_flip_string_flags by reference. In this case, kasan tries
>>> to detect whether tty_insert_flip_string_flags() does any out-of-bounds
>>> access on the pointers and adds 64 bytes redzone around each of
>>> the two variables.
>> Ouch.
>>> gcc-6.3.1 happens to inline 16 calls of tty_insert_flip_char() into
> I wonder if we should stop marking tty_insert_flip_char() as inline.

That would be an easy solution, yes. tty_insert_flip_char() was
apparently meant to be optimized for the fast path to completely
avoid calling into another function, but that fast path got a bit more
complex with commit acc0f67f307f ("tty: Halve flip buffer
GFP_ATOMIC memory consumption").

If we move it out of line, the fast path optimization goes away and
we could just have a simple implementation like

int tty_insert_flip_char(struct tty_port *port, unsigned char ch, char flag)
struct tty_buffer *tb = port->buf.tail;
int flags = (flag == TTY_NORMAL) ? TTYB_NORMAL : 0;

if (!__tty_buffer_request_room(port, 1, flags))
return 0;

if (~tb->flags & TTYB_NORMAL)
*flag_buf_ptr(tb, tb->used) = flag;
*char_buf_ptr(tb, tb->used++) = ch;

return 1;

One rather simple change I found would actually avoid the warning
and would seem to actually give us better runtime behavior even
without KASAN:

diff --git a/include/linux/tty_flip.h b/include/linux/tty_flip.h
index c28dd523f96e..15d03a14ad0f 100644
--- a/include/linux/tty_flip.h
+++ b/include/linux/tty_flip.h
@@ -26,7 +26,7 @@ static inline int tty_insert_flip_char(struct tty_port *port,
*char_buf_ptr(tb, tb->used++) = ch;
return 1;
- return tty_insert_flip_string_flags(port, &ch, &flag, 1);
+ return tty_insert_flip_string_fixed_flag(port, &ch, flag, 1);

static inline int tty_insert_flip_string(struct tty_port *port,

This reduces the stack frame size for kbd_event() to 1256 bytes,
which is well within the limit, and it lets us keep the flag-less
buffers across a 'tb->used >= tb->size' condition. Calling
into tty_insert_flip_string_flags() today will allocate a flag buffer
if there isn't already one, even when it is not needed.

>> I'm however afraid we'd have to mark a lot of static functions that way,
>> depending on the aggressivity of gcc... I'd indeed really argue that gcc
>> should consider stack usage when inlining.
>> static int f(int foo) {
>> char c[256];
>> g(c, foo);
>> }
>> is really not something that I'd want to see the compiler to inline.
> Why would not we want it be inlined? What we do not want us several
> calls having _separate_ instances of 'c' generated on the stack, all
> inlined calls should share 'c'. And of course if we have f1, f2, and
> f3 with c1, c2, and c3, GCC should not blow up the stack inlining and
> allocating stack for all 3 of them beforehand.
> But this all seems to me issue that should be solved in toolchain, not
> trying to play whack-a-mole with kernel sources.

The problem for the Samuel's example is that

a) the "--param asan-stack=1" option in KASAN does blow up the
stack, which is why the annotation is now called 'noinline_if_stackbloat'.

b) The toolchain cannot solve the problem, as most instances of the
problem (unlike kbd_put_queue) force the inlining unless you build
with the x86-specific CONFIG_OPTIMIZE_INLINING.