Re: [PATCH bpf v2 1/2] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator

From: Song Liu
Date: Thu Nov 05 2020 - 13:17:16 EST

> On Nov 4, 2020, at 6:25 PM, Daniel Xu <dxu@xxxxxxxxx> wrote:
>
> do_strncpy_from_user() may copy some extra bytes after the NUL

We have multiple uses of "NUL" here; should it be "NULL"?

> terminator into the destination buffer. This usually does not matter for
> normal string operations. However, when BPF programs key BPF maps with
> strings, this matters a lot.
>
> A BPF program may read strings from user memory by calling the
> bpf_probe_read_user_str() helper which eventually calls
> do_strncpy_from_user(). The program can then key a map with the
> resulting string. BPF map keys are fixed-width and string-agnostic,
> meaning that map keys are treated as a set of bytes.
>
> The issue is when do_strncpy_from_user() overcopies bytes after the NUL
> terminator, it can result in seemingly identical strings occupying
> multiple slots in a BPF map. This behavior is subtle and totally
> unexpected by the user.
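
To make the failure mode concrete, here is a minimal user-space sketch (not kernel or BPF code). It assumes a map that compares keys byte by byte over the full key width, as described above. KEY_SIZE, make_key() and keys_equal() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <string.h>

#define KEY_SIZE 16	/* hypothetical fixed key width */

/* Compare the full key width, the way a string-agnostic map would. */
static int keys_equal(const char a[KEY_SIZE], const char b[KEY_SIZE])
{
	return memcmp(a, b, KEY_SIZE) == 0;
}

/*
 * Build a key that holds the string s, with every byte after the
 * NUL terminator set to junk. A junk value of 0 models a clean copy;
 * anything else models bytes overcopied past the terminator.
 */
static void make_key(char key[KEY_SIZE], const char *s, char junk)
{
	size_t len = strlen(s);

	assert(len < KEY_SIZE);	/* string plus terminator must fit */
	memset(key, junk, KEY_SIZE);
	memcpy(key, s, len);
	key[len] = '\0';
}
```

Two keys built from the same string with different junk values compare equal under strcmp() but not under keys_equal(), so such a map would store them in separate slots.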
>
> This commit uses the proper word-at-a-time APIs to avoid overcopying.
>
> Fixes: 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers")
> Signed-off-by: Daniel Xu <dxu@xxxxxxxxx>
> ---
> lib/strncpy_from_user.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
> index e6d5fcc2cdf3..d084189eb05c 100644
> --- a/lib/strncpy_from_user.c
> +++ b/lib/strncpy_from_user.c
> @@ -35,17 +35,22 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src,
>  		goto byte_at_a_time;
>
>  	while (max >= sizeof(unsigned long)) {
> -		unsigned long c, data;
> +		unsigned long c, data, mask, *out;
>
>  		/* Fall back to byte-at-a-time if we get a page fault */
>  		unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);
>
> -		*(unsigned long *)(dst+res) = c;
>  		if (has_zero(c, &data, &constants)) {
>  			data = prep_zero_mask(c, data, &constants);
>  			data = create_zero_mask(data);
> +			mask = zero_bytemask(data);
> +			out = (unsigned long *)(dst+res);
> +			*out = (*out & ~mask) | (c & mask);
>  			return res + find_zero(data);
> +		} else {

This else clause is not needed, as we return in the if clause.
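
Roughly what I have in mind, as a user-space sketch rather than a drop-in replacement. It assumes a little-endian host and a source buffer with at least max readable bytes. zero_bits(), bytes_before_zero() and copy_str_words() are hypothetical stand-ins for the word-at-a-time helpers, and memcpy() stands in for unsafe_get_user() and the direct stores. In this sketch the mask covers only the bytes before the terminator.

```c
#include <stdint.h>
#include <string.h>

#define ONES  0x0101010101010101ull
#define HIGHS 0x8080808080808080ull

/* Nonzero iff v has a zero byte; the lowest set bit marks the first one. */
static uint64_t zero_bits(uint64_t v)
{
	return (v - ONES) & ~v & HIGHS;
}

/* 0xff in every byte below the first zero byte (little-endian order). */
static uint64_t bytes_before_zero(uint64_t bits)
{
	return ((bits - 1) & ~bits) >> 7;
}

/*
 * Copy whole words from src to dst until a NUL is found or fewer than
 * one word of max remains. Returns the number of string bytes copied.
 * A real caller would finish the tail byte by byte.
 */
static long copy_str_words(char *dst, const char *src, long max)
{
	long res = 0;

	while (max >= (long)sizeof(uint64_t)) {
		uint64_t c, out, bits, mask;
		long n = 0;

		memcpy(&c, src + res, sizeof(c));
		bits = zero_bits(c);
		if (bits) {
			/* Merge only the bytes before the NUL into dst. */
			mask = bytes_before_zero(bits);
			memcpy(&out, dst + res, sizeof(out));
			out = (out & ~mask) | (c & mask);
			memcpy(dst + res, &out, sizeof(out));
			for (; mask; mask >>= 8)
				n++;	/* one step per byte before the NUL */
			return res + n;
		}

		/* No else: the branch above always returns. */
		memcpy(dst + res, &c, sizeof(c));
		res += sizeof(uint64_t);
		max -= sizeof(uint64_t);
	}
	return res;
}
```

The full-word store sits at loop level after the if block, which keeps the common path one indentation level shallower.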

> +			*(unsigned long *)(dst+res) = c;
>  		}
> +
>  		res += sizeof(unsigned long);
>  		max -= sizeof(unsigned long);
>  	}
> --
> 2.28.0
>