Re: [PATCH bpf v3 1/2] bpf: allow UTF-8 literals in bpf_bprintf_prepare()
From: Paul Chaignon
Date: Thu Apr 16 2026 - 18:33:18 EST
On Thu, Apr 16, 2026 at 08:01:41PM +0800, Yihan Ding wrote:
> bpf_bprintf_prepare() only needs ASCII parsing for conversion
> specifiers. Plain text can safely carry bytes >= 0x80, so allow
> UTF-8 literals outside '%' sequences while keeping ASCII control
> bytes rejected and format specifiers ASCII-only.
>
> This keeps existing parsing rules for format directives unchanged,
> while allowing helpers such as bpf_trace_printk() to emit UTF-8
> literal text.
>
> Update test_snprintf_negative() in the same commit so selftests keep
> matching the new plain-text vs format-specifier split during bisection.
>
> Fixes: 48cac3f4a96d ("bpf: Implement formatted output helpers with bstr_printf")
> Signed-off-by: Yihan Ding <dingyihan@xxxxxxxxxxxxx>
> ---
> kernel/bpf/helpers.c | 17 ++++++++++++++++-
> .../testing/selftests/bpf/prog_tests/snprintf.c | 3 ++-
> 2 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 6eb6c82ed2ee..d51f1b612f1d 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -845,7 +845,13 @@ int bpf_bprintf_prepare(const char *fmt, u32 fmt_size, const u64 *raw_args,
> data->buf = buffers->buf;
>
> for (i = 0; i < fmt_size; i++) {
> - if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i])) {
> + unsigned char c = fmt[i];
I'm not convinced this extra variable is needed, but that alone
doesn't justify a v4.
> +
> + /*
> + * Permit bytes >= 0x80 in plain text so UTF-8 literals can pass
> + * through unchanged, while still rejecting ASCII control bytes.
> + */
> + if (isascii(c) && !isprint(c) && !isspace(c)) {
> err = -EINVAL;
> goto out;
> }
> @@ -867,6 +873,15 @@ int bpf_bprintf_prepare(const char *fmt, u32 fmt_size, const u64 *raw_args,
> * always access fmt[i + 1], in the worst case it will be a 0
> */
> i++;
> + c = fmt[i];
> + /*
> + * The format parser below only understands ASCII conversion
> + * specifiers and modifiers, so reject non-ASCII after '%'.
> + */
> + if (!isascii(c)) {
> + err = -EINVAL;
> + goto out;
> + }
>
> /* skip optional "[0 +-][num]" width formatting field */
> while (fmt[i] == '0' || fmt[i] == '+' || fmt[i] == '-' ||
> diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> index 594441acb707..4e4a82d54f79 100644
> --- a/tools/testing/selftests/bpf/prog_tests/snprintf.c
> +++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> @@ -114,7 +114,8 @@ static void test_snprintf_negative(void)
> ASSERT_ERR(load_single_snprintf("%--------"), "invalid specifier 5");
> ASSERT_ERR(load_single_snprintf("%lc"), "invalid specifier 6");
> ASSERT_ERR(load_single_snprintf("%llc"), "invalid specifier 7");
> - ASSERT_ERR(load_single_snprintf("\x80"), "non ascii character");
> + ASSERT_OK(load_single_snprintf("\x80"), "non ascii plain text");
> + ASSERT_ERR(load_single_snprintf("%\x80"), "non ascii in specifier");
Acked-by: Paul Chaignon <paul.chaignon@xxxxxxxxx>
> ASSERT_ERR(load_single_snprintf("\x1"), "non printable character");
> ASSERT_ERR(load_single_snprintf("%p%"), "invalid specifier 8");
> ASSERT_ERR(load_single_snprintf("%s%"), "invalid specifier 9");
> --
> 2.20.1
>
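
For anyone following along, here is a rough userspace sketch of the two
byte-level checks as I read them from the patch. It is only an illustrative
model, not the kernel code: fmt_bytes_ok() and is_ascii_byte() are
hypothetical names, the length is assumed to exclude the NUL terminator,
and the actual conversion-specifier validation is not modeled at all.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: stands in for the kernel's isascii(). */
static bool is_ascii_byte(unsigned char c)
{
	return c < 0x80;
}

/*
 * Hypothetical model of the byte checks only. Returns true if every byte
 * of fmt[0..fmt_size) passes; fmt_size must not include the terminator.
 */
static bool fmt_bytes_ok(const char *fmt, size_t fmt_size)
{
	for (size_t i = 0; i < fmt_size; i++) {
		unsigned char c = (unsigned char)fmt[i];

		/* Plain text: reject ASCII control bytes, let >= 0x80 through. */
		if (is_ascii_byte(c) && !isprint(c) && !isspace(c))
			return false;

		if (c != '%')
			continue;

		/* "%%" is a literal percent sign, not a specifier. */
		if (i + 1 < fmt_size && fmt[i + 1] == '%') {
			i++;
			continue;
		}

		/* The byte right after '%' must be ASCII. */
		if (i + 1 < fmt_size && !is_ascii_byte((unsigned char)fmt[i + 1]))
			return false;
	}
	return true;
}
```

With this model, "\x80" and "caf\xc3\xa9 %d" are accepted, while "%\x80"
and "\x1" are rejected, which matches the updated selftest expectations.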