Re: [PATCH bpf v4 2/3] bpf: Avoid faultable build ID reads under mm locks
From: Ihor Solodrai
Date: Thu May 14 2026 - 18:50:30 EST
On 5/14/26 3:14 PM, Shakeel Butt wrote:
> On Thu, May 14, 2026 at 02:31:58PM -0700, Ihor Solodrai wrote:
>> On 5/14/26 1:47 PM, Shakeel Butt wrote:
>>> On Thu, May 14, 2026 at 11:47:26AM -0700, Ihor Solodrai wrote:
>>>> Sleepable build ID parsing can block in __kernel_read() [1], so the
>>>> stackmap sleepable path must not call it while holding mmap_lock or a
>>>> per-VMA read lock.
>>>>
>>>> The issue and the fix are conceptually similar to a recent procfs
>>>> patch [2].
>>>>
>>>> Resolve each covered VMA with a stable read-side reference, preferring
>>>> lock_vma_under_rcu() and falling back to mmap_read_trylock()
>>>
>>> Why trylock()? Why not just reuse the mechanism introduced in [2]? That is
>>> abstract out the mechanism introduced in [2] in mm core and reuse it.
>>
>> v1 used mmap_read_lock() as the fallback, but Puranjay pointed out [1]
>> that stackmap can be called when the caller already holds
>> mmap_lock.
>
> Can you exapnd on the scenario where caller already holds mmap_lock? Is this
> code path can be taken from bpf programs and bpf programs can be attached at
> functions/code-paths already holding the mmap_lock?
Exactly. For example, tracing BPF programs may attach to mm internals, and
there are also BPF iterators over VMAs (i.e. SEC("iter/task_vma")). And
probably other use-cases.
Bottom line is we can't assume it's safe to take mmap_lock in a code path
reachable from a BPF program.
>
>> So I changed to trylock since v2, similar to non-sleepable
>> path.
>>
>> AFAIU this means that the common mechanism needs to support trylock
>> behavior as in stack_map_lock_vma() in this patch.
>>
>> Do you think this is worth factoring out of stackmap.c?
>
> Do you think we will need similar handling at more places in future
> (particularly in bpf world) then it makes sense to factoring out now? Otherwise
> we can wait until we are sure.
Honestly, I don't know. IMO we shouldn't prematurely generalize a piece of
code with a single user. But I just may be unaware of the others.
>
>>
>> [1] https://lore.kernel.org/bpf/m25x611s17.fsf@xxxxxxxxxx/
>>
>
>
>