Re: [PATCH] bpf: remove pointless code from bpf_do_trace_printk()

From: Rasmus Villemoes
Date: Thu Apr 22 2021 - 03:13:06 EST


On 22/04/2021 05.32, Andrii Nakryiko wrote:
> On Wed, Apr 21, 2021 at 6:19 PM Rasmus Villemoes
> <linux@xxxxxxxxxxxxxxxxxx> wrote:
>>
>> The comment is wrong. snprintf(buf, 16, "") and snprintf(buf, 16,
>> "%s", "") etc. will certainly put '\0' in buf[0]. The only case where
>> snprintf() does not guarantee a nul-terminated string is when it is
>> given a buffer size of 0 (which of course prevents it from writing
>> anything at all to the buffer).
>>
>> Remove it before it gets cargo-culted elsewhere.
>>
>> Signed-off-by: Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx>
>> ---
>> kernel/trace/bpf_trace.c | 3 ---
>> 1 file changed, 3 deletions(-)
>>
>
> The change looks good to me, but please rebase it on top of the
> bpf-next tree. This is not a bug, so it doesn't have to go into the
> bpf tree. As it is right now, it doesn't apply cleanly onto bpf-next.

Thanks for the pointer. Looking in next-20210420, it seems to me that

commit d9c9e4db186ab4d81f84e6f22b225d333b9424e3
Author: Florent Revest <revest@xxxxxxxxxxxx>
Date: Mon Apr 19 17:52:38 2021 +0200

bpf: Factorize bpf_trace_printk and bpf_seq_printf

is buggy. In particular, these two snippets:

+#define BPF_CAST_FMT_ARG(arg_nb, args, mod) \
+ (mod[arg_nb] == BPF_PRINTF_LONG_LONG || \
+ (mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64) \
+ ? (u64)args[arg_nb] \
+ : (u32)args[arg_nb])


+ ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod),
+ BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod));

Regardless of the casts done in that macro, the type of the resulting
expression is fixed at compile time by C's usual arithmetic conversions.
And (foo ? (u64)bla : (u32)blib) has type u64, which is thus the type the
compiler uses when building the vararg list being passed into snprintf().
C simply doesn't allow you to change types at run-time in this way.
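
To make that concrete, here is a small user-space sketch (not kernel
code). CAST_ARG, TYPE_NAME and the cast_result_*() helpers are names made
up for this illustration, and the u32/u64 typedefs are stand-ins for the
kernel's. It assumes a C11 compiler for _Generic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's u32/u64 typedefs. */
typedef uint32_t u32;
typedef uint64_t u64;

/* Name the static type of an expression without evaluating it (C11). */
#define TYPE_NAME(x) _Generic((x), u32: "u32", u64: "u64", default: "other")

/* Hypothetical macro with the same shape as BPF_CAST_FMT_ARG. */
#define CAST_ARG(want_64, arg) ((want_64) ? (u64)(arg) : (u32)(arg))

/* Size of the conditional expression; sizeof does not evaluate it. */
static size_t cast_result_size(int want_64, u64 arg)
{
	return sizeof(CAST_ARG(want_64, arg));
}

/* Static type of the conditional expression, as a string. */
static const char *cast_result_type(int want_64, u64 arg)
{
	return TYPE_NAME(CAST_ARG(want_64, arg));
}

/* Value of the conditional expression, in its actual (u64) type. */
static u64 cast_result_value(int want_64, u64 arg)
{
	return CAST_ARG(want_64, arg);
}

static void check_cast(void)
{
	/* The type is u64 whichever arm is selected at run time. */
	assert(cast_result_size(0, 1) == sizeof(u64));
	assert(cast_result_size(1, 1) == sizeof(u64));
	/* The (u32) arm only truncates the value, then widens it again. */
	assert(cast_result_value(0, 0x100000001ULL) == 1);
}
```

So the (u32) cast affects the value, but an 8-byte object is what ends up
in the vararg list either way.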

It probably works fine on x86-64, which passes the first six integer
arguments in registers: va_start() puts those registers into the opaque
va_list structure, and when it comes time to do a va_arg(ap, int), just
the lower 32 bits are used. It is broken on i386 and other architectures
where arguments are passed on the stack (and it would be on x86-64 as
well, had there been a few more arguments), because there va_arg(ap, int)
is essentially ({ int res = *(int *)ap; ap += 4; res; }) [or maybe it's
-= 4 because of stack direction etc.; that's not really relevant here].
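
A rough user-space model of that stack-based case, for illustration only.
The "stack" is just a byte buffer, and fake_va_list, fake_push_u64(),
fake_va_arg_int(), fake_va_arg_u64() and the two pass_*() helpers are all
made-up names; no real ABI's alignment or slot rules are modelled.

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t u64;

#define NARGS 3

/* Hypothetical model of a stack-based va_list: a cursor into raw bytes. */
struct fake_va_list {
	const unsigned char *ap;
};

/* Caller side: every argument is stored as a full 8-byte u64. */
static void fake_push_u64(unsigned char **sp, u64 v)
{
	memcpy(*sp, &v, sizeof(v));
	*sp += sizeof(v);
}

/* Callee side: fetch an int and advance by sizeof(int) only. */
static int fake_va_arg_int(struct fake_va_list *va)
{
	int res;

	memcpy(&res, va->ap, sizeof(res));
	va->ap += sizeof(res);
	return res;
}

/* Callee side: fetch a u64 and advance by sizeof(u64). */
static u64 fake_va_arg_u64(struct fake_va_list *va)
{
	u64 res;

	memcpy(&res, va->ap, sizeof(res));
	va->ap += sizeof(res);
	return res;
}

/* Pass NARGS u64 values, read them back as ints (the mismatched case). */
static void pass_u64_read_int(const u64 args[NARGS], int out[NARGS])
{
	unsigned char stack[NARGS * sizeof(u64)];
	unsigned char *sp = stack;
	struct fake_va_list va = { stack };
	int i;

	for (i = 0; i < NARGS; i++)
		fake_push_u64(&sp, args[i]);
	for (i = 0; i < NARGS; i++)
		out[i] = fake_va_arg_int(&va);
}

/* Pass NARGS u64 values, read them back as u64 (the matched case). */
static void pass_u64_read_u64(const u64 args[NARGS], u64 out[NARGS])
{
	unsigned char stack[NARGS * sizeof(u64)];
	unsigned char *sp = stack;
	struct fake_va_list va = { stack };
	int i;

	for (i = 0; i < NARGS; i++)
		fake_push_u64(&sp, args[i]);
	for (i = 0; i < NARGS; i++)
		out[i] = fake_va_arg_u64(&va);
}
```

With args = { 1, 2, 3 }, the matched reader gets the values back, while
the int reader steps through the halves of the first two u64 slots and
never reaches the third argument; which halves it sees depends on
endianness.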

Rasmus