Question: BPF stack build-id lookup while holding mmap_lock
From: Runyu Xiao
Date: Wed Jun 17 2026 - 23:37:30 EST
Hi,
While auditing lock ordering around faultable build-id lookups, our
static analysis tool flagged the BPF stackmap user-build-id path, and we
manually reviewed it against the current tree.
The path we are concerned about is the sleepable helper path:
  bpf_get_stack_sleepable() / bpf_get_task_stack_sleepable()
    -> __bpf_get_stack(..., may_fault = true)
      -> stack_map_get_build_id_offset()
        -> mmap_read_trylock(current->mm)
        -> build_id_parse(vma, ...)
          -> __kernel_read()
`build_id_parse()` can read from the backing file while mmap_lock is
held. That can form an ABBA dependency with file read paths that take an
inode-side lock first and then fault in copy_to_user()/copy_page_to_iter(),
which needs mmap_lock.
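To make the ordering argument concrete, here is a minimal userspace sketch of the kind of check lockdep performs: record each "held -> acquired" edge and flag an edge that closes a cycle. It is only a model. The lock classes, struct lock_graph, and record_dependency() are hypothetical names, and INODE_SIDE stands in for whichever lock the file read path actually holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes. INODE_SIDE is a stand-in for whatever
 * lock the file read path holds; it is not a real kernel lock name. */
enum lock_class { MMAP_LOCK, INODE_SIDE, NR_CLASSES };

/* dep[a][b] is true once some path acquired b while holding a. */
struct lock_graph {
	bool dep[NR_CLASSES][NR_CLASSES];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is `to` reachable from `from` via recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to,
		      bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/* Record "held -> acquired". Returns true if the new edge closes a
 * cycle, i.e. `held` was already reachable from `acquired`. */
static bool record_dependency(struct lock_graph *g, enum lock_class held,
			      enum lock_class acquired)
{
	bool seen[NR_CLASSES] = { false };
	bool cycle;

	assert(held != acquired);
	cycle = reachable(g, (int)acquired, (int)held, seen);
	g->dep[held][acquired] = true;
	return cycle;
}
```

In this model, the build-id path contributes the edge MMAP_LOCK -> INODE_SIDE and the faulting file read contributes INODE_SIDE -> MMAP_LOCK. The second record_dependency() call returns true, which corresponds to the circular dependency described above.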
A minimal lockdep reproducer preserving this BPF stackmap carrier and
the reverse file-read edge reports:
  WARNING: possible circular locking dependency detected
    __kernel_read
    stack_map_get_build_id_offset
    __bpf_get_stack
  *** DEADLOCK ***
The local fix I am considering is only for the faultable build-id path.
It would snapshot the VMA file reference and offset metadata under
mmap_lock, drop mmap_lock, and then parse the build-id from the file
reference with build_id_parse_file(). The existing no-fault path would
remain unchanged.
Roughly:
  1. Under mmap_lock, find the VMA for each user IP.
  2. Take a file reference and snapshot vm_start/vm_pgoff.
  3. Drop mmap_lock.
  4. Parse build IDs from the files.
  5. Fall back to reporting IPs if the faultable path cannot safely
     release mmap_lock or allocate the temporary snapshot array.
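The steps above can be modeled in userspace as a two-phase loop. This is a sketch under stated assumptions, not kernel code: struct model_mm, struct model_file, resolve_build_ids(), and the boolean `locked` flag are all hypothetical stand-ins, the page shift of 12 is assumed, and only the allocation-failure fallback from step 5 is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_PAGE_SHIFT 12 /* assumed page size of 4 KiB */

/* Userspace stand-ins; none of these are kernel types. */
struct model_file { int refs; uint32_t build_id; };
struct model_vma { uint64_t start, end, pgoff; struct model_file *file; };
struct model_mm { bool locked; const struct model_vma *vmas; size_t nr_vmas; };

struct snapshot { struct model_file *file; uint64_t start, pgoff; };
struct entry { bool has_build_id; uint32_t build_id; uint64_t offset_or_ip; };

/* VMA lookup is only valid while the (modeled) mmap_lock is held. */
static const struct model_vma *find_vma_locked(const struct model_mm *mm,
					       uint64_t ip)
{
	assert(mm->locked);
	for (size_t i = 0; i < mm->nr_vmas; i++) {
		if (ip >= mm->vmas[i].start && ip < mm->vmas[i].end)
			return &mm->vmas[i];
	}
	return NULL;
}

/* Stand-in for the faultable file read; must not run under the lock. */
static uint32_t parse_build_id_unlocked(const struct model_mm *mm,
					const struct model_file *file)
{
	assert(!mm->locked);
	return file->build_id;
}

static void resolve_build_ids(struct model_mm *mm, const uint64_t *ips,
			      size_t n, struct entry *out)
{
	struct snapshot *snap = calloc(n, sizeof(*snap));

	/* Step 5 default: every entry starts out as a raw IP. */
	for (size_t i = 0; i < n; i++)
		out[i] = (struct entry){ .offset_or_ip = ips[i] };
	if (!snap)
		return; /* no snapshot array: keep the IP fallback */

	mm->locked = true; /* steps 1-2: look up and snapshot under lock */
	for (size_t i = 0; i < n; i++) {
		const struct model_vma *vma = find_vma_locked(mm, ips[i]);

		if (!vma || !vma->file)
			continue;
		vma->file->refs++; /* pin the file past the unlock */
		snap[i] = (struct snapshot){ vma->file, vma->start, vma->pgoff };
	}
	mm->locked = false; /* step 3: drop the lock before any file read */

	for (size_t i = 0; i < n; i++) { /* step 4: parse from snapshots */
		if (!snap[i].file)
			continue;
		out[i].has_build_id = true;
		out[i].build_id = parse_build_id_unlocked(mm, snap[i].file);
		out[i].offset_or_ip = (snap[i].pgoff << MODEL_PAGE_SHIFT) +
				      ips[i] - snap[i].start;
		snap[i].file->refs--; /* release the pinned reference */
	}
	free(snap);
}
```

The assertion in parse_build_id_unlocked() encodes the property the change is after: the file read never runs while the lock is held. The offset is computed purely from the copied start/pgoff values, so it reflects the mapping as it was at snapshot time.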
The tradeoff is that build-id parsing would happen after releasing
mmap_lock, so the VMA/file relationship is represented by the file
reference and the copied metadata rather than by holding mmap_lock
across the file read. That avoids file I/O under mmap_lock, but may
change edge-case behavior if the mapping changes concurrently.
Does this direction sound acceptable for sleepable BPF stack helpers, or
would you prefer a stricter fallback-to-IP behavior whenever build-id
parsing would require faultable file I/O? Another option would be to
avoid build-id parsing entirely in the may_fault=true stackmap path. If
there is an existing BPF/MM helper pattern I should reuse instead, I
would be glad to follow it.
The local draft subject is:
  bpf: avoid faultable build-id lookup under mmap_lock
Thanks,
Runyu