Re: [PATCH v4] libbpf: fix UAF in strset__add_str()

From: Andrii Nakryiko

Date: Thu May 28 2026 - 17:37:19 EST


On Sat, May 23, 2026 at 9:27 AM Carlos Llamas <cmllamas@xxxxxxxxxx> wrote:
>
> strset_add_str_mem() might reallocate the strset data buffer in order to
> accommodate the provided string 's'. However, if 's' points to a string
> already present in the buffer, it becomes dangling after the realloc.
> This leads to a use-after-free when attempting to memcpy() the string
> into the new buffer.
>
> One scenario that triggers this problematic path is when resolve_btfids
> attempts to patch kfunc prototypes using existing BTF parameter names:
>
> | resolve_btfids: function bpf_list_push_back_impl already exists in BTF
> | Segmentation fault (core dumped)
>
> Compiling resolve_btfids with fsanitize=address generates a detailed
> report of the UAF:
>
> | =================================================================
> | ERROR: AddressSanitizer: heap-use-after-free on address 0x7f4c4a500bd4
> | ==1507892==ERROR: AddressSanitizer: heap-use-after-free on address 0x7f4c4a500bd4 at pc 0x55d25155a2a8 bp 0x7ffcef879060 sp 0x7ffcef878818
> | READ of size 5 at 0x7f4c4a500bd4 thread T0
> | #0 0x55d25155a2a7 in memcpy (tools/bpf/resolve_btfids/resolve_btfids+0xcf2a7)
> | #1 0x55d2515d708e in strset__add_str tools/lib/bpf/strset.c:162:2
> | #2 0x55d2515c730b in btf__add_str tools/lib/bpf/btf.c:2109:8
> | #3 0x55d2515c9020 in btf__add_func_param tools/lib/bpf/btf.c:3108:14
> | #4 0x55d25159f0b5 in process_kfunc_with_implicit_args tools/bpf/resolve_btfids/main.c:1196:9
> | #5 0x55d25159e004 in btf2btf tools/bpf/resolve_btfids/main.c:1229:9
> | #6 0x55d25159cee7 in main tools/bpf/resolve_btfids/main.c:1535:6
> | #7 0x7f4c78e29f76 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
> | #8 0x7f4c78e2a026 in __libc_start_main csu/../csu/libc-start.c:360:3
> | #9 0x55d2514bb860 in _start (tools/bpf/resolve_btfids/resolve_btfids+0x30860)
> |
> | 0x7f4c4a500bd4 is located 13268 bytes inside of 2829000-byte region [0x7f4c4a4fd800,0x7f4c4a7b02c8)
> | freed by thread T0 here:
> | #0 0x55d25155b700 in realloc (tools/bpf/resolve_btfids/resolve_btfids+0xd0700)
> | #1 0x55d2515c426c in libbpf_reallocarray tools/lib/bpf/./libbpf_internal.h:220:9
> | #2 0x55d2515c426c in libbpf_add_mem tools/lib/bpf/btf.c:224:13
> |
> | previously allocated by thread T0 here:
> | #0 0x55d25155b2e3 in malloc (tools/bpf/resolve_btfids/resolve_btfids+0xd02e3)
> | #1 0x55d2515d6e7d in strset__new tools/lib/bpf/strset.c:58:20
>
> While resolve_btfids could be refactored to avoid this call path, let's
> instead fix this issue at the source in strset__add_str() and avoid
> similar scenarios.
>
> Let's check if set->strs_data was reallocated and whether 's' points to
> an internal string within the old strset buffer. In such case, 's' is
> reconstructed to point to the new buffer.
>
> While already here, also fix strset__find_str() which suffers from the
> same problem by factoring out the common operations into a new helper
> function strset_str_append().
>
> Fixes: 90d76d3ececc ("libbpf: Extract internal set-of-strings datastructure APIs")
> Suggested-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> Suggested-by: Mykyta Yatsenko <yatsenko@xxxxxxxx>
> Signed-off-by: Carlos Llamas <cmllamas@xxxxxxxxxx>
> ---
> v4:
> Store pointers as integers in advance before realloc to prevent UB.
> Access set->strs_data directly, not through external API.
>
> v3:
> Switch to 's' reconstruction approach suggested by Andrii.
> Adjusted names and commit log accordingly.
> https://lore.kernel.org/all/20260518050550.2600101-1-cmllamas@xxxxxxxxxx/
>
> v2:
> Implemented the fix in strset__offset() helper as suggested by Mykyta.
> Added support to handle "substrings" of existing ones.
> Used 90d76d3ececc as Fixes tag as suggested by Sashiko.
> https://lore.kernel.org/all/20260515044759.2863546-1-cmllamas@xxxxxxxxxx/
>
> v1:
> https://lore.kernel.org/all/20260513232055.1681859-1-cmllamas@xxxxxxxxxx/
>
> tools/lib/bpf/strset.c | 59 +++++++++++++++++++++++++++---------------
> 1 file changed, 38 insertions(+), 21 deletions(-)
>
> diff --git a/tools/lib/bpf/strset.c b/tools/lib/bpf/strset.c
> index 2464bcbd04e0..b9faca828f09 100644
> --- a/tools/lib/bpf/strset.c
> +++ b/tools/lib/bpf/strset.c
> @@ -107,6 +107,38 @@ static void *strset_add_str_mem(struct strset *set, size_t add_sz)
> set->strs_data_len, set->strs_data_max_len, add_sz);
> }
>
> +static long strset_str_append(struct strset *set, const char *s)
> +{
> + uintptr_t old_data = (uintptr_t)set->strs_data;
> + uintptr_t old_s = (uintptr_t)s;
> + long len = strlen(s) + 1;
> + void *p;
> +
> + /* Hashmap keys are always offsets within set->strs_data, so to even
> + * look up some string from the "outside", we need to first append it
> + * at the end, so that it can be addressed with an offset. Luckily,
> + * until set->strs_data_len is incremented, that string is just a piece
> + * of garbage for the rest of the code, so no harm, no foul. On the
> + * other hand, if the string is unique, it's already appended and
> + * ready to be used, only a simple set->strs_data_len increment away.
> + */
> + p = strset_add_str_mem(set, len);
> + if (!p)
> + return -ENOMEM;
> +
> + /* The set->strs_data might have reallocated and if 's' pointed
> + * to an internal string within the old buffer, then it became
> + * dangling and needs to be reconstructed before the copy.
> + */
> + if (old_data && old_data != (uintptr_t)set->strs_data &&
> + old_s >= old_data && old_s < old_data + set->strs_data_len)

We should use the old strs_data_len here. I fixed it up (along with the
comment style issues pointed out by AI) like so:

$ git diff
diff --git a/tools/lib/bpf/strset.c b/tools/lib/bpf/strset.c
index b9faca828f09..ace73c6b3d62 100644
--- a/tools/lib/bpf/strset.c
+++ b/tools/lib/bpf/strset.c
@@ -110,11 +110,13 @@ static void *strset_add_str_mem(struct strset *set, size_t add_sz)
static long strset_str_append(struct strset *set, const char *s)
{
uintptr_t old_data = (uintptr_t)set->strs_data;
+ size_t old_data_len = set->strs_data_len;
uintptr_t old_s = (uintptr_t)s;
long len = strlen(s) + 1;
void *p;

- /* Hashmap keys are always offsets within set->strs_data, so to even
+ /*
+ * Hashmap keys are always offsets within set->strs_data, so to even
* look up some string from the "outside", we need to first append it
* at the end, so that it can be addressed with an offset. Luckily,
* until set->strs_data_len is incremented, that string is just a piece
@@ -126,12 +128,13 @@ static long strset_str_append(struct strset *set, const char *s)
if (!p)
return -ENOMEM;

- /* The set->strs_data might have reallocated and if 's' pointed
+ /*
+ * The set->strs_data might have reallocated and if 's' pointed
* to an internal string within the old buffer, then it became
* dangling and needs to be reconstructed before the copy.
*/
if (old_data && old_data != (uintptr_t)set->strs_data &&
- old_s >= old_data && old_s < old_data + set->strs_data_len)
+ old_s >= old_data && old_s < old_data + old_data_len)
s = set->strs_data + (old_s - old_data);

memcpy(p, s, len);
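
For anyone following along, here is a minimal standalone sketch of the same
rebasing pattern outside of libbpf. The names struct buf, buf_reserve() and
buf_append_str() are hypothetical stand-ins for illustration, not libbpf
APIs, and unlike strset_str_append() this sketch also bumps the used length
itself. Both the old base and the old length are snapshotted before the
buffer may grow, so the bounds check does not depend on whatever the reserve
step does to the live fields:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer, standing in for struct strset. */
struct buf {
	char *data;
	size_t len;	/* bytes in use */
	size_t cap;	/* bytes allocated */
};

/* Make room for add_sz more bytes; returns the write position or NULL. */
static void *buf_reserve(struct buf *b, size_t add_sz)
{
	if (b->len + add_sz > b->cap) {
		size_t new_cap = (b->len + add_sz) * 2;
		char *p = realloc(b->data, new_cap);

		if (!p)
			return NULL;
		b->data = p;
		b->cap = new_cap;
	}
	return b->data + b->len;
}

/*
 * Append string s, which may point inside b->data itself.
 * Returns the offset of the copy, or -1 on allocation failure.
 */
static long buf_append_str(struct buf *b, const char *s)
{
	uintptr_t old_data, old_s;
	size_t old_len, len;
	long off;
	void *p;

	assert(b && s);

	/* Snapshot everything as integers before the buffer can move. */
	old_data = (uintptr_t)b->data;
	old_len = b->len;
	old_s = (uintptr_t)s;
	len = strlen(s) + 1;

	p = buf_reserve(b, len);
	if (!p)
		return -1;

	/* If the buffer moved and s pointed into it, rebase s. */
	if (old_data && old_data != (uintptr_t)b->data &&
	    old_s >= old_data && old_s < old_data + old_len)
		s = b->data + (old_s - old_data);

	memcpy(p, s, len);
	off = (long)b->len;
	b->len += len;
	return off;
}
```

Calling buf_append_str(&b, b.data + off) in a loop is enough to exercise the
rebase path once the buffer outgrows its initial capacity.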


applied to bpf-next, thanks!

> + s = set->strs_data + (old_s - old_data);
> +
> + memcpy(p, s, len);
> +
> + return len;
> +}
> +

[...]